Given C samples, with n_i observations in the ith sample, 
a test of the hypothesis that the samples are from the same 
population may be made by ranking the observations 
from 1 to Σn_i (giving each observation in a group of ties the 
mean of the ranks tied for), finding the C sums of ranks, and 
computing a statistic H. Under the stated hypothesis, H is 
distributed approximately as χ²(C − 1), unless the samples are 
too small, in which case special approximations or exact 
tables are provided. One of the most important applications of 
the test is in detecting differences among the population 
means.* 


1. INTRODUCTION 
1.1. Problem 


A COMMON problem in practical statistics is to decide whether 
several samples should be regarded as coming from the same 
population. Almost invariably the samples differ, and the question is 
whether the differences signify differences among the populations, or 
are merely the chance variations to be expected among random samples 
from the same population. When this problem arises one may often 
assume that the populations are of approximately the same form, in 
the sense that if they differ it is by a shift or translation. 


1.2. Usual Solution 


The usual technique for attacking such problems is the analysis 
of variance with a single criterion of classification [46, Chap. 10]. The 
variation among the sample means, x̄_i, is used to estimate the variation 
among individuals, on the basis of (i) the assumption that the variation 
among the means reflects only random sampling from a population 
in which individuals vary, and (ii) the fact that the variance of the 
means of random samples of size n_i is σ²/n_i, where σ² is the population 
variance. This estimate of σ² based on the variation among sample 
means is then compared with another estimate based only on the variation 
within samples. The agreement between these two estimates is 
tested by the variance ratio distribution with C − 1 and N − C degrees 
of freedom (where N is the number of observations in all C samples 
combined), using the test statistic F(C − 1, N − C). A value of F larger 
than would ordinarily result from two independent sample estimates 
of a single population variance is regarded as contradicting the hypothesis 
that the variation among the sample means is due solely to random 
sampling from a population whose individuals vary. 


* Based in part on research supported by the Office of Naval Research at the Statistical Research 
Center, University of Chicago. 

For criticisms of a preliminary draft which have led to a number of improvements we are indebted 
to Maurice H. Belz (University of Melbourne), William G. Cochran (Johns Hopkins University), 
J. Durbin (London School of Economics), Churchill Eisenhart (Bureau of Standards), Wassily Hoeffding 
(University of North Carolina), Harold Hotelling (University of North Carolina), Howard L. 
Jones (Illinois Bell Telephone Company), Erich L. Lehmann (University of California), William G. 
Madow (University of Illinois), Henry B. Mann (Ohio State University), Alexander M. Mood (The 
Rand Corporation), Lincoln E. Moses (Stanford University), Frederick Mosteller (Harvard University), 
David L. Russell (Bowdoin College), I. Richard Savage (Bureau of Standards), Frederick F. Stephan 
(Princeton University), Alan Stuart (London School of Economics), T. J. Terpstra (Mathematical 
Center, Amsterdam), John W. Tukey (Princeton University), Frank Wilcoxon (American Cyanamid 
Company), and C. Ashley Wright (Standard Oil Company of New Jersey), and to our colleagues K. A. 
Brownlee, Herbert T. David, Milton Friedman, Leo A. Goodman, Ulf Grenander, Joseph L. Hodges, 
Harry V. Roberts, Murray Rosenblatt, Leonard J. Savage, and Charles M. Stein. 

When σ² is known, it is used in place of the estimate based on the 
variation within samples, and the test is based on the χ²(C − 1) 
distribution (that is, χ² with C − 1 degrees of freedom) using the test 
statistic 


(1.1)    \chi^2(C-1) = \sum_{i=1}^{C} \frac{(\bar{x}_i - \bar{x})^2}{\sigma^2 / n_i}


where x̄ is the mean of all N observations. 


1.3. Advantages of Ranks 


Sometimes it is advantageous in statistical analysis to use ranks 
instead of the original observations—that is, to array the N observa- 
tions in order of magnitude and replace the smallest by 1, the next-to- 
smallest by 2, and so on, the largest being replaced by N. The ad- 
vantages are: 


(1) The calculations are simplified. Most of the labor when using ranks is in 
    making the ranking itself, and short cuts can be devised for this. For 
    example, class intervals can be set up as for a frequency distribution, 
    and actual observations entered instead of tally marks. Another method is 
    to record the observations on cards or plastic chips¹ which can be arranged 
    in order, the cards perhaps by sorting devices. 

(2) Only very general assumptions are made about the kind of distributions 
    from which the observations come. The only assumptions underlying the 
    use of ranks made in this paper are that the observations are all 
    independent, that all those within a given sample come from a single 
    population, and that the C populations are of approximately the same form. 
    The F and χ² tests described in the preceding section assume approximate 
    normality in addition. 

(3) Data available only in ordinal form may often be used. 

(4) When the assumptions of the usual test procedure are too far from reality, 
    not only is there a problem of distribution theory if the usual test is used, 
    but it is possible that the usual test may not have as good a chance as a 
    rank test of detecting the kinds of difference of real interest. 


The present paper presents an analog, based on ranks and called the 
H test, to one-criterion variance analysis. 


¹ We are indebted to Frank Wilcoxon for this suggestion. 
1.4. The H Test 


The rank test presented here requires that all the observations be 
ranked together, and the sum of the ranks obtained for each sample. 
The test statistic to be computed if there are no ties (that is, if no two 
observations are equal) is 


(1.2)    H = \frac{12}{N(N+1)} \sum_{i=1}^{C} \frac{R_i^2}{n_i} - 3(N+1)    (no ties)


where 


C = the number of samples, 

n_i = the number of observations in the ith sample, 

N = Σn_i, the number of observations in all samples combined, 

R_i = the sum of the ranks in the ith sample. 


Large values of H lead to rejection of the null hypothesis. 
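
As an illustration only, formula (1.2) can be sketched in a few lines of Python. The function names are ours, not part of the paper, and the data are the bottle-cap outputs of Table 2.1; the sketch assumes there are no ties and refuses to run otherwise.

```python
def rank_sums_no_ties(samples):
    """Rank all observations together and return the sum of ranks per sample."""
    pooled = sorted(x for sample in samples for x in sample)
    if len(set(pooled)) != len(pooled):
        raise ValueError("ties present: formula (1.2) alone does not apply")
    rank_of = {x: i + 1 for i, x in enumerate(pooled)}  # smallest gets rank 1
    return [sum(rank_of[x] for x in sample) for sample in samples]


def h_no_ties(samples):
    """H of formula (1.2), for data without ties."""
    sizes = [len(sample) for sample in samples]
    n_total = sum(sizes)
    rank_sums = rank_sums_no_ties(samples)
    weighted = sum(r * r / n for r, n in zip(rank_sums, sizes))
    return 12.0 * weighted / (n_total * (n_total + 1)) - 3.0 * (n_total + 1)


# Outputs of Table 2.1: standard machine, modification 1, modification 2.
table_2_1 = [
    [340, 345, 330, 342, 338],
    [339, 333, 344],
    [347, 343, 349, 355],
]
print(rank_sums_no_ties(table_2_1))   # rank sums R_i
print(round(h_no_ties(table_2_1), 3)) # H, compare with Section 2.1
```

The rank sums and H printed here should agree with the values 24, 14, 40 and 5.656 worked out by hand in Section 2.1.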

If the samples come from identical continuous populations and the 
n_i are not too small, H is distributed approximately as χ²(C − 1), 
permitting use of the readily available tables of χ². When the n_i are 
small and C = 2, tables are available which are described in Section 5.3. 
For C = 3 and all n_i ≤ 5, tables are presented in Section 6. For other 
cases where the χ² approximation is not adequate, two special 
approximations are described in Section 6.2. 

If there are ties, each observation is given the mean of the ranks for 
which it is tied. H as computed from (1.2) is then divided by 


(1.3)    1 - \frac{\sum T}{N^3 - N}


where the summation is over all groups of ties and T = (t − 1)t(t + 1) 
= t³ − t for each group of ties, t being the number of tied observations 
in the group. Values of T for t up to 10 are shown in Table 1.1.² 


TABLE 1.1 
(See Section 3.1.2) 


t      1     2     3     4     5     6     7     8     9    10 
T      0     6    24    60   120   210   336   504   720   990 


Since (1.3) must lie between zero and one, dividing by it increases (1.2). 
If all N observations are equal, (1.3) reduces (1.2) to the indeterminate 
form 0/0. If there are no ties, each value of t is 1, so ΣT = 0 and (1.2) is 
unaltered by (1.3). Thus, (1.2) divided by (1.3) gives a general 
expression which holds whether or not there are ties, assuming that such 
ties as occur are given mean ranks: 


(1.4)    H = \frac{\dfrac{12}{N(N+1)} \sum_{i=1}^{C} \dfrac{R_i^2}{n_i} - 3(N+1)}{1 - \dfrac{\sum T}{N^3 - N}}


² DuBois [4, Table I] gives values of T/12 (his c₁) and T/6 (his c₂) for t (his N) from 5 to 50. 


In many situations the difference between (1.4) and (1.2) is negligible. 
A working guide is that with ten or fewer samples a χ² probability of 
0.01 or more obtained from (1.2) will not be changed by more than 
ten per cent by using (1.4), provided that not more than one-fourth of 
the observations are involved in ties.³ H for large samples is still 
distributed as χ²(C − 1) when ties are handled by mean ranks; but the 
tables for small samples, while still useful, are no longer exact. 
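
A minimal sketch of the mean-rank procedure and the divisor (1.3) follows. It is our illustration, not the authors' computing routine; the helper names and the small tied data set used in the usage line are hypothetical.

```python
from collections import Counter


def mean_ranks(pooled):
    """Return ({value: mean rank}, {value: number of tied observations})."""
    counts = Counter(pooled)
    rank_of, next_rank = {}, 1
    for value in sorted(counts):
        t = counts[value]
        # the t tied observations occupy ranks next_rank .. next_rank + t - 1
        rank_of[value] = next_rank + (t - 1) / 2
        next_rank += t
    return rank_of, counts


def h_with_ties(samples):
    """Return (H adjusted as in (1.4), H from (1.2), divisor (1.3))."""
    pooled = [x for sample in samples for x in sample]
    n_total = len(pooled)
    rank_of, counts = mean_ranks(pooled)
    rank_sums = [sum(rank_of[x] for x in sample) for sample in samples]
    weighted = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    h_raw = 12.0 * weighted / (n_total * (n_total + 1)) - 3.0 * (n_total + 1)
    sum_t = sum(t ** 3 - t for t in counts.values())  # T = t^3 - t per group
    divisor = 1.0 - sum_t / (n_total ** 3 - n_total)
    if divisor == 0.0:
        raise ValueError("all observations equal: H is the indeterminate 0/0")
    return h_raw / divisor, h_raw, divisor


# Hypothetical data with one group of three ties (the value 2).
h_adjusted, h_raw, divisor = h_with_ties([[1, 2, 2], [2, 3, 4]])
print(h_raw, divisor, h_adjusted)
```

With no ties every t is 1, the divisor is exactly 1, and the function returns the value of (1.2) unchanged.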

For understanding the nature of H, a better formulation of (1.2) is 


(1.5)    H = \frac{N-1}{N} \sum_{i=1}^{C} \frac{n_i \left[\bar{R}_i - \frac{1}{2}(N+1)\right]^2}{(N^2-1)/12}    (no ties)


where R̄_i is the mean of the n_i ranks in the ith sample. If we ignore the 
factor (N − 1)/N, and note that ½(N + 1) is the mean and (N² − 1)/12 
the variance of the uniform distribution over the first N integers, we 
see that (1.5), like (1.1), is essentially a sum of squared standardized 
deviations of random variables from their population mean. In this 
respect, H is similar to χ², which is defined as a sum of squares of 
standardized normal deviates, subject to certain conditions on the 
relations among the terms of the sum. If the n_i are not too small, the 
R̄_i jointly will be approximately normally distributed and the relations 
among them will meet the χ² conditions. 
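
That the mean-rank form of H is only a rearrangement of (1.2) can be checked in two lines. The following worked step is ours, using nothing beyond ΣR_i = ½N(N + 1), Σn_i = N and R̄_i = R_i/n_i:

```latex
\begin{align*}
\sum_{i=1}^{C} n_i\left[\bar{R}_i - \tfrac12(N+1)\right]^2
  &= \sum_{i=1}^{C}\frac{R_i^2}{n_i} - (N+1)\sum_{i=1}^{C} R_i
     + \tfrac14 (N+1)^2 \sum_{i=1}^{C} n_i
   = \sum_{i=1}^{C}\frac{R_i^2}{n_i} - \tfrac14 N(N+1)^2 , \\
\frac{N-1}{N}\cdot\frac{12}{N^2-1}
  \left[\sum_{i=1}^{C}\frac{R_i^2}{n_i} - \tfrac14 N(N+1)^2\right]
  &= \frac{12}{N(N+1)}\sum_{i=1}^{C}\frac{R_i^2}{n_i} - 3(N+1).
\end{align*}
```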


2. EXAMPLES 
2.1. Without Ties 


In a factory, three machines turn out large numbers of bottle caps. 
One machine is standard and two have been modified in different 
ways, but otherwise the machines and their operating conditions are 
identical. On any one day, only one machine is operated. Table 2.1 
shows the production of the machines on various days and the 
calculation of H as 5.656. The true probability, if the machines really are 
the same with respect to output, that H should be as large as 5.656 
is shown in Figure 6.1 and Table 6.1 as 0.049. The approximation to 
this probability given by the χ²(2) distribution is 0.059. Two more 
complicated approximations described in Section 6.2 give 0.044 and 
0.045. 


³ Actually, for the case described it is possible for the discrepancy slightly to exceed ten per cent. 
For a given total number of ties, S, the second term of (1.3) is a maximum if all S ties are in one group, 
and this maximum, (S³ − S)/(N³ − N), is slightly less than (S/N)³. Thus, for S/N = ¼, (1.3) > 63/64. 
The 0.01 level of χ²(9) is 21.666. This divided by 63/64 is 22.010, for which the probability is 0.00885, 
a change of 11½ per cent. For higher probability levels, fewer samples, or more than one group of ties, 
the percentage change in probability would be less. With the S ties divided into G groups, the second 
term of (1.3) is always less than [(S − h)³ + 4h]/N³, where h = 2(G − 1). 


TABLE 2.1 


DAILY BOTTLE-CAP PRODUCTION OF THREE MACHINES. 
(Artificial data.) 


            Standard        Modification 1    Modification 2 
          Output   Rank     Output   Rank     Output   Rank        Sum 
            340      5        339      4        347     10 
            345      9        333      2        343      7 
            330      1        344      8        349     11 
            342      6                          355     12 
            338      3 

n                    5                 3                 4          12 
R                   24                14                40          78 
R²/n             115.2            65.333              400.     580.533 

Checks: Σn = N = 12,  ΣR = ½N(N + 1) = 78 

H = (12 × 580.533)/(12 × 13) − 3 × 13 = 5.656 ~ χ²(2)    from (1.2) 
Pr[χ²(2) ≥ 5.656] = 0.059    from [9] or [13] 
Pr[H(5, 4, 3) ≥ 5.656] = 0.049    from Table 6.1 


If the production data of Table 2.1 are compared by the conventional 
analysis of variance, F(2, 9) is 4.2284, corresponding to a probability 
of 0.051. 


2.2. With Ties 


Snedecor's data on the birth weight of pigs [46, Table 10.12] are 
shown in Table 2.2, together with the calculation of H adjusted for 
the mean-rank method of handling ties. Here H as adjusted⁴ is 18.566. 
The true probability in this case would be difficult to find, but the 
χ²(7) approximation gives a probability of 0.010. Two better 
approximations described in Section 6.2 give 0.006 and 0.005. 

The conventional analysis of variance [46, Sec. 10.8] gives F(7, 48) 
= 2.987, corresponding to a probability of 0.011. 


⁴ Note that, as will often be true in practice, the adjustment is not worth the trouble even in this case: 
by changing H from 18.464 to 18.566, it changed the probability by only 0.0003, or 3 per cent. Since 
there are 47 ties in 13 groups, we see from the last sentence of note 3 that (1.3) cannot be less than 
1 − (23³ + 96)/56³, which is 0.9302. 


3. JUSTIFICATION OF THE METHOD 
3.1. Two Samples 


The rationale of the H test can be seen most easily by considering 
the case of only two samples, of sizes n and N—n. As is explained in 
Section 5.3, the H test for two samples is essentially the same as a 
test published by Wilcoxon [61] in 1945 and later by others. 

In this case, we consider either one of the two samples, presumably 
the smaller for simplicity, and denote its size by n and its sum of ranks 
by R. We ask whether the mean rank of this sample is larger (or smaller) 
than would be expected if n of the integers 1 through N were selected 
at random without replacement. 

The sum of the first N integers is ½N(N + 1) and the sum of their 
squares is ⅙N(N + 1)(2N + 1). It follows that the mean and variance of 
the first N integers are ½(N + 1) and (N² − 1)/12. 

The means of samples of n drawn at random without replacement 
from the N integers will be normally distributed to an approximation 
close enough for practical purposes, provided that n and N − n are not 
too small. The mean of a distribution of sample means is, of course, the 
mean of the original distribution; and the variance of a distribution of 
sample means is (σ²/n)[(N − n)/(N − 1)], where σ² is the population 
variance, N is the population size, and n is the sample size. In this case, 
σ² = (N² − 1)/12, so 


(3.1)    \sigma_{\bar{R}}^2 = \frac{(N^2-1)(N-n)}{12n(N-1)} = \frac{(N+1)(N-n)}{12n}


where σ_R̄² represents the variance of the mean of n numbers drawn at 
random without replacement from N consecutive integers. Letting R̄ 
denote the mean rank for a sample of n, 


(3.2)    \frac{\bar{R} - \frac{1}{2}(N+1)}{\sqrt{(N+1)(N-n)/(12n)}}


may be regarded as approximately a unit normal deviate. The square 
of (3.2) is H as given by (1.2) with C = 2,⁵ and the square of a unit 
normal deviate has the χ²(1) distribution. 


⁵ This may be verified by replacing R̄ in (3.2) by R/n and letting the two values of R_i in (1.2) be 
R and ½N(N + 1) − R, with n and N − n the corresponding values of n_i. 
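
The finite-population variance used for the mean rank, (N + 1)(N − n)/(12n), can be checked by brute force for small N. The sketch below is ours; it enumerates every sample of n of the first N integers and compares the variance of the sample means with the closed form, using exact fractions.

```python
from fractions import Fraction
from itertools import combinations


def variance_of_mean_rank(n_total, n):
    """Variance of the mean of n integers drawn without replacement from 1..N."""
    means = [Fraction(sum(c), n) for c in combinations(range(1, n_total + 1), n)]
    centre = sum(means) / len(means)
    return sum((m - centre) ** 2 for m in means) / len(means)


def closed_form(n_total, n):
    """The closed form (N + 1)(N - n) / (12 n) of formula (3.1)."""
    return Fraction((n_total + 1) * (n_total - n), 12 * n)


# N = 9, n = 4 are the sizes of the first example in Section 3.1.3.
print(variance_of_mean_rank(9, 4), closed_form(9, 4))
```

Both expressions give the same fraction for any n between 1 and N, since the identity is exact and not an approximation.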
Notice that this expression is the same, except for sign, whichever of 
the two samples is used to compute it. For if the first sample contains 
n ranks whose mean is R̄, the other sample must contain N − n ranks 
whose mean is 


(3.3)    \frac{\frac{1}{2}N(N+1) - n\bar{R}}{N-n}


and the value of (3.2) is changed only in sign if we interchange n and 
N − n, and replace R̄ by (3.3). 

In the two-sample case the normal deviate is perhaps a little simpler 
to compute than is H; furthermore, the sign of the normal deviate is 
needed if a one-tail test is required. For computations, formula (3.2) 
may be rewritten 


(3.4)    \frac{2R - n(N+1)}{\sqrt{n(N+1)(N-n)/3}}


The null hypothesis is that the two samples come from the same 
population. The alternative hypothesis is that the samples come from 
populations of approximately the same form, but shifted or translated with 
respect to each other. If we are concerned with the one-sided 
alternative that the population producing the sample to which R and n relate 
is shifted upward, then we reject when (3.4) is too large. The critical 
level of (3.4) at the α level of significance is approximately K_α, the unit 
normal deviate exceeded with probability α, as defined by 


(3.5)    \frac{1}{\sqrt{2\pi}} \int_{K_\alpha}^{\infty} e^{-x^2/2}\, dx = \alpha


Values of (3.4) as large as K_α or larger result in rejection of the null 
hypothesis. If the alternative is one-sided but for a downward shift, 
the null hypothesis is rejected when (3.4) is as small as −K_α or smaller. 
If the alternative is two-sided and symmetrical, the null hypothesis is 
rejected if (3.4) falls outside the range −K_{α/2} to +K_{α/2}. 
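
The two-sample computation can be sketched as follows. This is our illustration under the no-ties, no-continuity-adjustment assumptions; the function names are hypothetical, and the normal tail area is taken from the complementary error function in Python's standard library rather than from printed tables.

```python
import math


def normal_deviate(rank_sum, n, n_total):
    """Formula (3.4): approximate unit normal deviate for a rank sum R."""
    numerator = 2 * rank_sum - n * (n_total + 1)
    denominator = math.sqrt(n * (n_total + 1) * (n_total - n) / 3)
    return numerator / denominator


def upper_tail(z):
    """Probability that a unit normal deviate exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def two_tail(z):
    """Symmetrical two-tail probability for a unit normal deviate z."""
    return 2 * upper_tail(abs(z))


# n = 4, N = 9, R = 12 are the values of the first example in Section 3.1.3.
z = normal_deviate(12, 4, 9)
print(round(z, 4), round(two_tail(z), 4))
```

For a one-sided alternative of an upward shift one would use `upper_tail(z)`, and for a downward shift `upper_tail(-z)`.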

3.1.1. Continuity adjustment. It seems reasonable to expect that a 
continuity adjustment may be desirable, to allow for the fact that R, 
the sum of the ranks in one sample, can take only integral values, 
whereas the normal distribution is continuous.⁶ In testing against a 
two-sided alternative to the null hypothesis, the adjustment is made 
by increasing or decreasing R by ½, whichever brings it closer to 
½n(N + 1), before substituting into (3.4). (If R = ½n(N + 1), ignore the 
continuity adjustment.) With a one-sided alternative, R is increased 
(decreased) by ½ if the alternative is that the sample for which R is 
computed comes from the population which is to the left (right) of 
the other. 


⁶ An extensive comparison of exact probabilities for the two-sample test [28] with those based on the 
normal approximation indicates that the normal approximation is usually better with the continuity 
adjustment when the probability is above 0.02, and better without it when the probability is 0.02 or 
below. This comparison was made for us by Jack Karush, who has also rendered invaluable assistance 
with numerous other matters in the preparation of this paper. 

3.1.2. Ties. If some of the N observations are equal, we suggest that 
each member of a group of ties be given the mean of the ranks tied for in 
that group. This does not affect the mean rank, ½(N + 1). It does, 
however, reduce the variance below (N² − 1)/12. Letting T = (t − 1)t(t + 1) 
for each group of ties, where t is the number of tied observations in the 
group, and letting ΣT represent the sum of the values of T for all 
groups of ties, we have, instead of (3.1), 


(3.6)    \sigma_{\bar{R}}^2 = \frac{N(N^2-1) - \sum T}{12Nn} \cdot \frac{N-n}{N-1}


as the variance of the mean rank for samples of n. When there are no 
ties, ΣT = 0 and (3.6) reduces to (3.1), so (3.6) may be regarded as the 
general expression for σ_R̄² when the mean-rank method is used for such 
ties as occur. Notice that (3.6) is the product of (3.1) and (1.3). 

This adjustment comes about as follows:⁷ The variance (N² − 1)/12 is 
obtained by subtracting the square of the mean from the mean of the 
squares of N consecutive integers. If each of the t integers (x + 1) 
through (x + t) is replaced by x + ½(t + 1), the sum is not changed but 
the sum of the squares is reduced by 


(3.7)    \sum_{i=1}^{t} (x+i)^2 - t\left[x + \tfrac{1}{2}(t+1)\right]^2 = \frac{T}{12}


So the mean of the squares, and consequently the variance, is reduced 
by T/(12N). 
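
The reduction T/12 in the sum of squares follows from the usual formulas for Σi and Σi²; the intermediate algebra, supplied here by us as a worked step, is:

```latex
\begin{align*}
\sum_{i=1}^{t}(x+i)^2 - t\left[x+\tfrac12(t+1)\right]^2
 &= \left[t x^2 + x\,t(t+1) + \tfrac16\, t(t+1)(2t+1)\right]
  - \left[t x^2 + x\,t(t+1) + \tfrac14\, t(t+1)^2\right] \\
 &= t(t+1)\,\frac{2(2t+1) - 3(t+1)}{12}
  = \frac{(t-1)\,t\,(t+1)}{12} = \frac{T}{12}.
\end{align*}
```

The terms in x cancel, so the reduction depends only on the size t of the group of ties and not on where in the ranking the group falls.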

The mean-rank method of handling ties somewhat complicates the 
continuity adjustment, for the possible values of R are no longer simply 
the consecutive integers ½n(n + 1) to ½n(2N − n + 1), nor need 
they be symmetrical about ½n(N + 1). Our guess, however, is that it is 
better to make the ±½ adjustment of Section 3.1.1 than not to make 
any. 


⁷ This is the adjustment alluded to by Friedman [10, footnote 11]. An equivalent adjustment for 
mean ranks has been suggested by Hemelrijk [16, formula (6)], but in a very complex form. A much 
simpler version of his formula (6) is obtained by multiplying our (3.6) by n². The same adjustment has 
been suggested by Horn [18a]. 

This adjustment, however, goes back at least as far as a 1921 paper by 'Student' [48a], applying 
it to the Spearman rank correlation coefficient. For further discussion and other references, see Kendall 
[20, Chap. 3]. 
An alternative method of handling ties is to assign the ranks at random 
within a group of tied observations. The distribution of H under 
the null hypothesis is then the same as if there had been no ties, since 
the null hypothesis is that the ranks are distributed at random. In order 
to use this method, adequate randomization must be provided, with 
consequent complications in making and verifying computations. Some 
statisticians argue further that the introduction of extraneous random 
variability tends to reduce the power of a test. We do not know whether 
for the H test random ranking of ties gives more or less power than 
mean ranks; indeed, it may be that the answer varies from one alternative 
hypothesis to another and from one significance level to another.⁸ 
When all members of a group of ties fall within the same sample, every 
assignment of their ranks gives rise to the same value of H, so that it 
might be thought artificial in this instance to use mean ranks; even 
here, however, a heuristic argument can be made for mean ranks, on the 
ground that H interprets a particular assignment of ranks against the 
background of all possible assignments of the same ranks to samples of 
the given sizes, and some of the possible assignments put the ties into 
different samples.⁹ 

3.1.3. Examples. (i) As a first example consider a particularly simple 
one discussed by Pitman [41]. 


TABLE 3.1 
PITMAN EXAMPLE [41, p. 122] 


        Sample A                  Sample B 

  Observation   Rank        Observation   Rank 

        0         1              16         4 
       11         2              19         5 
       12         3              22         7 
       20         6              24         8 
                                 29         9 


n = 4, R = 12 


⁸ A few computations for simple distributions and small samples, some carried out by Howard L. 
Jones and some by us, showed mean ranks superior sometimes and random ranks others. For theoretical 
purposes, random ranking of ties is much easier to handle. For practical purposes, it should be 
remembered that there will ordinarily be little difference between the two methods; see notes 3 and 4. 
Computational considerations, therefore, lead us to suggest the mean-rank method. 

Ranking of tied observations at random should be distinguished from increasing the power of a test 
by rejecting or accepting the null hypothesis on the basis of an ancillary random device, in such a way 
as to attain a nominal significance level which, because of discontinuities, could not otherwise be 
attained. Discussions of this are given by Eudey [6] and E. S. Pearson [37]. 

⁹ This is illustrated in the calculation of the exact probability for the data of Table 3.2. 
If we use the two-tail H test without adjustment for continuity, we 
compute the approximate unit-normal deviate from (3.4): 

\frac{(2 \times 12) - (4 \times 10)}{\sqrt{(4 \times 10 \times 5)/3}} = \frac{-16}{\sqrt{200/3}} = -1.9596    (no adjustment) 

corresponding to a two-tail normal probability of 0.0500. 

If we make the continuity adjustment, we get: 

\frac{(2 \times 12.5) - (4 \times 10)}{\sqrt{(4 \times 10 \times 5)/3}} = \frac{-15}{\sqrt{200/3}} = -1.8371    (continuity adjustment) 

corresponding to a two-tail normal probability of 0.0662. 

Actually, since the samples are so small, it is easy to compute the 
true probability under the null hypothesis of a value of R as extreme as, 
or more extreme than, 12. There are 9!/(4! 5!) or 126 ways of selecting 
four ranks from among the nine, and all 126 ways are equally probable 
under the null hypothesis. Only four of the 126 lead to values of R of 
12 or less. By symmetry another set of four lead to values as extreme but 
in the opposite direction, that is, n(N + 1) − R = 28 or more. Hence the 
true probability to compare with the foregoing approximations is 
8/126, or 0.06349. This value can also be obtained from the tables 
given by Mann and Whitney [28]; they show 0.032 for one tail, and 
when doubled this agrees, except for rounding, with our calculation.¹⁰ 
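
The count of 8 extreme selections out of 126, and the effect of the continuity adjustment, can be reproduced with a short enumeration. The sketch is ours; it assumes no ties and a two-sided alternative, and the function names are hypothetical.

```python
import math
from fractions import Fraction
from itertools import combinations


def exact_two_tail(rank_sum, n, n_total):
    """Exact probability of a rank sum at least as far from n(N+1)/2 as observed."""
    centre = Fraction(n * (n_total + 1), 2)
    distance = abs(rank_sum - centre)
    sums = [sum(c) for c in combinations(range(1, n_total + 1), n)]
    extreme = sum(1 for s in sums if abs(s - centre) >= distance)
    return Fraction(extreme, len(sums))


def continuity_adjusted(rank_sum, n, n_total):
    """Move R by 1/2 toward n(N+1)/2, as in Section 3.1.1 (two-sided case)."""
    centre = n * (n_total + 1) / 2
    if rank_sum < centre:
        return rank_sum + 0.5
    if rank_sum > centre:
        return rank_sum - 0.5
    return rank_sum


def two_tail_normal(rank_sum, n, n_total):
    """Two-tail normal probability from formula (3.4)."""
    z = (2 * rank_sum - n * (n_total + 1)) / math.sqrt(
        n * (n_total + 1) * (n_total - n) / 3)
    return math.erfc(abs(z) / math.sqrt(2))


exact = exact_two_tail(12, 4, 9)
approx_plain = two_tail_normal(12, 4, 9)
approx_adjusted = two_tail_normal(continuity_adjusted(12, 4, 9), 4, 9)
print(float(exact), round(approx_plain, 4), round(approx_adjusted, 4))
```

In this example the adjusted approximation lies closer to the exact value than the unadjusted one, in line with the guidance of note 6 for probabilities above 0.02.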

(ii) A second, and more realistic, example will illustrate the kind of 
complication that arises in practice. Table 3.2 shows the results of two 
alternative methods of technical chemical analysis. Since there are ties 
(two groups of two ties), mean ranks are used. 


TABLE 3.2 
BROWNLEE EXAMPLE [2, p. 36] 


      Method A              Method B 
   Value    Rank         Value    Rank 
    95.6     9½           93.3     4 
    94.9     7            92.1     3 
    96.2    12            94.7     5½ 
    95.1     8            90.1     2 
    95.8    11            95.6     9½ 
    96.3    13            90.0     1 
                          94.7     5½ 


n = 6, R = 60½ 


¹⁰ Pitman [41] gives a test which is like H except that it considers possible permutations of the actual 
observations instead of their ranks. For the example of Table 3.1, Pitman's test yields a two-tail 
probability of 5/126 or 0.03968. 
If we use (3.4) without adjusting either for continuity or for the use 
of mean ranks, we obtain as our approximate unit-normal deviate 

\frac{121 - 84}{14} = \frac{37}{14} = 2.6429    (no adjustments) 

which corresponds to the two-tail normal probability of 0.0082. 

If we use the adjustment for mean ranks, we find that ΣT = 12, so 
(3.6) gives σ_R̄ = 1.1635 and the denominator of (3.4), which is 


(3.8)    \sqrt{n(N+1)(N-n)/3} = 2n\sigma_{\bar{R}},


is adjusted to 13.9615. This leads to the approximate unit-normal 
deviate 

\frac{121 - 84}{13.9615} = 2.6501    (adjusted for mean ranks) 

corresponding to a two-tail probability of 0.0080, not appreciably 
different from the result without the adjustment. 

The continuity adjustment is not desirable in this case, since the 
probability level is appreciably less than 0.02.⁶ The comments of 
Section 3.1.2 about irregularities in the sequence of possible values of R 
also apply. For purely illustrative purposes, however, we note that the 
effect of the continuity adjustment would be to reduce R from 60½ to 
60, resulting in an approximate normal deviate of 

\frac{120 - 84}{13.9615} = 2.5785    (adjusted for continuity and mean ranks) 

for which the symmetrical two-tail normal probability is 0.0099. 

The true probability in this case can be computed by considering all 
possible sets of six that could be selected from the 13 ranks 1, 2, 3, 4, 
5½, 5½, 7, 8, 9½, 9½, 11, 12, 13. There are 13!/(6! 7!) or 1716 such sets, all 
equally probable under the null hypothesis. Six of them give rise to 
values of R greater than or equal to 60½, and five give rise to values of 
R less than or equal to 23½, which is as far below as 60½ is above 
½n(N + 1). Hence the true probability is 11/1716, or 0.00641. 
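
The whole of this example, from the mean ranks to the exact probability, can be retraced by machine. The sketch below is ours; it takes the values of Table 3.2 as given, keeps the mean ranks as exact fractions, and treats the two members of each tied pair as distinct items when enumerating the sets of six.

```python
import math
from collections import Counter
from fractions import Fraction
from itertools import combinations

method_a = [95.6, 94.9, 96.2, 95.1, 95.8, 96.3]
method_b = [93.3, 92.1, 94.7, 90.1, 95.6, 90.0, 94.7]

pooled = sorted(method_a + method_b)
counts = Counter(pooled)

# Mean ranks: t tied values starting at rank r each receive r + (t - 1)/2.
rank_of, next_rank = {}, 1
for value in sorted(counts):
    t = counts[value]
    rank_of[value] = Fraction(2 * next_rank + t - 1, 2)
    next_rank += t

n, n_total = len(method_a), len(pooled)
rank_sum = sum(rank_of[x] for x in method_a)     # R for Method A
sum_t = sum(t ** 3 - t for t in counts.values())  # sum of T over tie groups

# Variance of the mean rank with ties, formula (3.6).
variance = ((n_total * (n_total ** 2 - 1) - sum_t) / (12 * n_total * n)
            * (n_total - n) / (n_total - 1))
numerator = 2 * float(rank_sum) - n * (n_total + 1)
z_unadjusted = numerator / math.sqrt(n * (n_total + 1) * (n_total - n) / 3)
z_adjusted = numerator / (2 * n * math.sqrt(variance))

# Exact two-tail probability over all sets of n of the 13 mean ranks.
all_ranks = [rank_of[x] for x in pooled]
centre = Fraction(n * (n_total + 1), 2)
distance = abs(rank_sum - centre)
sums = [sum(c) for c in combinations(all_ranks, n)]
extreme = sum(1 for s in sums if abs(s - centre) >= distance)
exact = Fraction(extreme, len(sums))

print(rank_sum, sum_t, round(z_unadjusted, 4), round(z_adjusted, 4), float(exact))
```

Note that `Fraction` reduces 11/1716 to lowest terms when printed, so the exact probability is shown here as a decimal.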


3.2. Three Samples 


When there are three samples, we may consider the average ranks for 
any two of them, say the ith and jth. The other sample, the kth, 
would not tell us anything we cannot find out from two, for its mean 
rank must be 


(3.9)    \bar{R}_k = \frac{\frac{1}{2}N(N+1) - (n_i\bar{R}_i + n_j\bar{R}_j)}{N - (n_i + n_j)}


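If the n's are not too small, the joint distribution of R̄_i and R̄_j will be 
approximately that bivariate normal distribution whose exponent is 


(3.10)    -\frac{1}{2(1-\rho^2)} \left[ \left(\frac{\bar{R}_i - \frac{N+1}{2}}{\sigma_{\bar{R}_i}}\right)^2 - 2\rho \left(\frac{\bar{R}_i - \frac{N+1}{2}}{\sigma_{\bar{R}_i}}\right)\left(\frac{\bar{R}_j - \frac{N+1}{2}}{\sigma_{\bar{R}_j}}\right) + \left(\frac{\bar{R}_j - \frac{N+1}{2}}{\sigma_{\bar{R}_j}}\right)^2 \right]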

The variances needed in (3.10) are given by (3.1) and the correlation 
by 


(3.11)    \rho = -\sqrt{\frac{n_i n_j}{(N-n_i)(N-n_j)}}


which is the correlation between the means of samples of sizes n_i and 
n_j when all n_i + n_j are drawn at random without replacement from a 
population of N elements.¹¹ Thus the exponent (3.10) of the bivariate 
normal distribution which approximates the joint distribution of R̄_i 
and R̄_j is, when multiplied by −2, 


(3.12)    \frac{12}{N(N+1)(N-n_i-n_j)} \left[ n_i(N-n_j)\left(\bar{R}_i - \frac{N+1}{2}\right)^2 + 2n_in_j\left(\bar{R}_i - \frac{N+1}{2}\right)\left(\bar{R}_j - \frac{N+1}{2}\right) + n_j(N-n_i)\left(\bar{R}_j - \frac{N+1}{2}\right)^2 \right]


It is well known that −2 times the exponent of a bivariate normal 
distribution has the χ²(2) distribution [32, Sec. 10.10]. Hence (3.12) could 
be taken as our test statistic for the three-sample problem, and 
approximate probabilities found from the χ² tables. 


¹¹ Although (3.11) is easily derived and is undoubtedly familiar to experts on sampling from finite 
populations, we have not found it in any of the standard treatises. It is a special case of a formula used 
by Neyman [47, p. 39] in 1923, and a more general case of one used by K. Pearson [38] in 1924. For 
assistance in trying to locate previous publications of (3.11) we are indebted to Churchill Eisenhart, Tore 
Dalenius (Stockholm), W. Edwards Deming (Bureau of the Budget), P. M. Grundy (Rothamsted 
Experimental Station) who told us of [38], Morris H. Hansen (Bureau of the Census), Maurice G. 
Kendall (London School of Economics), Jerzy Neyman (University of California) who told us of [47], 
June H. Roberts (Chicago), Frederick F. Stephan who provided a compact derivation of his own, 
John W. Tukey, and Frank Yates (Rothamsted Experimental Station). 

From the relations 


(3.13)    n_i\bar{R}_i + n_j\bar{R}_j + n_k\bar{R}_k = \tfrac{1}{2}N(N+1)

and 

(3.14)    n_i + n_j + n_k = N

it can be shown that the value of (3.12) will be the same whichever pair 
of samples is used in it, and that this value will be H as given by (1.2) 
with C=3. For computing, (1.2) has the advantages of being simpler 
than (3.12) and of treating all (R, n) pairs alike. 
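
The claim that the pairwise statistic does not depend on the pair chosen can be checked numerically. In the sketch below, which is ours, the quadratic form is the one obtained by expanding −2 times the bivariate normal exponent with the no-ties variances and the correlation −√[n_i n_j/((N − n_i)(N − n_j))]; the rank sums and sample sizes are those of the bottle-cap example of Section 2.1.

```python
def pairwise_statistic(mean_i, n_i, mean_j, n_j, n_total):
    """-2 x (bivariate normal exponent) for the mean ranks of two of three samples."""
    centre = (n_total + 1) / 2
    d_i, d_j = mean_i - centre, mean_j - centre
    quadratic = (n_i * (n_total - n_j) * d_i ** 2
                 + 2 * n_i * n_j * d_i * d_j
                 + n_j * (n_total - n_i) * d_j ** 2)
    return 12 * quadratic / (n_total * (n_total + 1) * (n_total - n_i - n_j))


rank_sums, sizes = [24, 14, 40], [5, 3, 4]
n_total = sum(sizes)
means = [r / n for r, n in zip(rank_sums, sizes)]

values = [
    pairwise_statistic(means[a], sizes[a], means[b], sizes[b], n_total)
    for a, b in [(0, 1), (0, 2), (1, 2)]
]
h = 12 * sum(r * r / n for r, n in zip(rank_sums, sizes)) / (
    n_total * (n_total + 1)) - 3 * (n_total + 1)
print([round(v, 4) for v in values], round(h, 4))
```

All three pairs give the same number as H from (1.2), up to floating-point rounding.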

With three or more samples, adjustments for continuity are unim- 
portant except when the n; are so small that special tables of the true 
distribution should be used anyway. 

Since the adjustment for the mean-rank method of handling ties is 
a correction to the sum of squares of the N ranks, it is the same for 
three or more groups as for two. The variances given by (3.1) for the 
case without ties are replaced by (3.6) when there are ties; hence (1.2) 
with mean ranks should be divided by (1.3) to give H as shown by 
(1.4). 


3.3. More than Three Samples 


Nothing essentially new is involved when there are more than three 
samples. If there are C samples, the mean ranks for any C—1 of them 
are jointly distributed approximately according to a multivariate nor- 
mal distribution, provided that the sample sizes are not too small. The 
exponent of this (C − 1)-variate normal distribution will have the same 
value whichever set of C − 1 samples is used. This value, when 
multiplied by −2, will be H as given by (1.2), and it will be distributed 
approximately as χ²(C − 1), provided the n_i are not too small. The 
exponent of the approximating multivariate normal distribution is more 
complicated than for three samples, but it involves only the variances 
of the R̄_i as given by (3.6) and the correlations among pairs (R̄_i, R̄_j) as 
given by (3.11). 

By using matrix algebra, the general formula for H is obtained quite 
as readily as the formulas for two and three samples by the methods 
used in this paper. A mathematically rigorous discussion of H for the 
general case of C samples is presented elsewhere by Kruskal [25], 
together with a formal proof that its distribution under the null 
hypothesis is asymptotically χ². 
4. INTERPRETATION OF THE TEST 
4.1. General Considerations 


H tests the null hypothesis that the samples all come from identical 
populations. In practice, it will frequently be interpreted, as is F in the 
analysis of variance, as a test that the population means are equal 
against the alternative that at least one differs. So to interpret it, how- 
ever, is to imply something about the kinds of differences among the 
populations which, if present, will probably lead to a significant value 
of H, and the kinds which, even if present, will probably not lead to a 
significant value of H. To justify this or any similar interpretation, we 
need to know something about the power of the test: For what alterna- 
tives to identity of the populations will the test probably lead to rejec- 
tion, and for what alternatives will it probably lead to acceptance of 
the null hypothesis that the populations are identical? Unfortunately, 
for the H test as for many nonparametric tests the power is difficult to 
investigate and little is yet known about it. 

It must be recognized that relations among ranks need not conform 
to the corresponding relations among the data before ranking. It is 
possible, for example, that if an observation is drawn at random from 
each of two populations, the one from the first population is larger in 
most pairs, but the average of those from the second population is 
larger. In such a case the first population may be said to have the 
higher average rank but the lower average value. 
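
A small numerical illustration may make this concrete. The two discrete populations below are hypothetical and chosen by us only for the purpose: the first always yields 1, while the second yields 0 with probability 3/5 and 10 with probability 2/5.

```python
from fractions import Fraction

# Hypothetical populations, given as {value: probability}.
first = {1: Fraction(1)}
second = {0: Fraction(3, 5), 10: Fraction(2, 5)}


def mean(population):
    """Population mean of a discrete distribution."""
    return sum(value * prob for value, prob in population.items())


def prob_first_larger(a, b):
    """Probability that a random draw from a exceeds an independent draw from b."""
    return sum(pa * pb
               for x, pa in a.items()
               for y, pb in b.items()
               if x > y)


print(prob_first_larger(first, second))  # more than one-half
print(mean(first), mean(second))         # yet the second mean is larger
```

Here the observation from the first population is the larger in three pairs out of five, although the mean of the second population is four times that of the first.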

It has been shown by Kruskal [25] that a necessary and sufficient 
condition for the H test to be consistent¹² is that there be at least one 
of the populations for which the limiting probability is not one-half 
that a random observation from this population is greater than an in- 
dependent random member of the N sample observations. Thus, what 
H really tests is a tendency for observations in at least one of the popu- 
lations to be larger (or smaller) than all the observations together, 
when paired randomly. In many cases, this is practically equivalent to 
the mean of at least one population differing from the others. 


4.2. Comparison of Means when Variability Differs 


Rigorously interpreted, all we can conclude from a significant value 
of H is that the populations differ, not necessarily that the means 
differ. In particular, if the populations differ in variability we cannot, 
strictly speaking, infer from a significant value of H that the means 
differ. In the data of Table 3.2, for example, the variances of the two 
chemical methods differ significantly (normal theory probability less 
than 0.01) and substantially (by a factor of 16), as Brownlee shows 
[2]. A strict interpretation of H and its probability of less than 0.01 
does not, therefore, justify the conclusion that the means of the two 
chemical methods differ. 


12 A test is consistent against an alternative if, when applied at the same level of significance for 
increasing sample size, the probability of rejecting the null hypothesis when the alternative is true 
approaches unity. Actually, the necessary and sufficient condition stated here must be qualified in a way 
that is not likely to affect the interpretation of the H test suggested in this paragraph. An exact 
statement is given in [25]. 

There is some reason to conjecture, however, that in practice the H 
test may be fairly insensitive to differences in variability, and so may 
be useful in the important “Behrens-Fisher problem” of comparing 
means without assuming equality of variances. Perhaps, for example, 
we could conclude that the means of the two chemical methods of 
Table 3.2 differ. The following considerations lend plausibility to this 
conjecture (and perhaps suggest extending it to other differences in 
form): 

(i) The analysis of consistency referred to in Section 4.1 shows that 
if two symmetrical populations differ only by a scale factor about their 
common mean the H test is not consistent for small significance levels; 
in other words, below a certain level of significance there is no assur- 
ance that the null hypothesis of identical populations will be rejected, 
no matter how large the samples. 

(ii) Consider the following extreme case: Samples of eight are drawn 
from two populations having the same mean but differing so much in 
variability that there is virtually no chance that any of the sample from 
the more variable population will lie within the range of the other sam- 
ple. Furthermore, the median of the more variable population is at the 
common mean, so that its observations are as likely to lie above as to 
lie below the range of the sample from the less variable population. The 
actual distribution of H under these assumptions is easily computed 
from the binomial distribution with parameters 8 and 1/2. Figure 4.1 
shows the exact distribution of H under the null hypothesis that the 
two populations are completely identical, under the symmetrical al- 
ternative just described, and under a similar but skew alternative in 
which the probability is 0.65 that an observation from the more varia- 
ble population will lie below the range of the other sample and 0.35 
that it will lie above. Possible values of H under each hypothesis are 
those at which occur the risers in the corresponding step function of 
Figure 4.1, and the probabilities at these possible values of H are given 
by the tops of the risers. Figure 4.1 shows, for example, that samples in 
which seven observations from the more variable population lie above 
and one lies below the eight observations from the less variable popula- 
tion (so that the two values of R are 44 and 92, leading to an H of 
6.353) would be judged by the H test to be significant at the 0.010 level 
using true probabilities (or at the 0.012 level using the χ² approximation), 
while such samples will occur about seven per cent of the time 
under the symmetrical alternative and about seventeen per cent under 
the other. In view of the extreme difference of the variances assumed 
in the alternatives, it seems rather striking that the cumulative 
distributions given in Figure 4.1 do not differ more than they do. At least 
in the case of the symmetrical alternative, the distribution for the null 
hypothesis seems not too poor a partial smoothing, though on the whole 
it lies too low. 

Figure 4.1. Distribution of H for two samples of 8, under the null hypothesis 
that the populations are identical and under two alternatives in which the means 
are the same but the variances are extremely different. (For further specification 
of the alternatives, see Section 4.2.) 

The applicability of the H test to the Behrens-Fisher problem, par- 
ticularly in its two-tail form, merits further investigation. 
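
The extreme case considered in (ii) can be checked numerically. The Python sketch below is an illustration under the assumptions stated there (two samples of eight, no ties, every observation from the more variable population outside the range of the other sample); the function names are hypothetical. It computes H from the two rank sums by the usual formula H = 12/[N(N+1)] ΣRᵢ²/nᵢ − 3(N+1), and obtains the probability of equalling or exceeding a given H from the binomial distribution.

```python
from math import comb


def h_statistic(rank_sums, sizes):
    """Kruskal-Wallis H from the rank sums and sample sizes (no ties)."""
    n_total = sum(sizes)
    weighted = sum(r * r / n for r, n in zip(rank_sums, sizes))
    return 12.0 * weighted / (n_total * (n_total + 1)) - 3 * (n_total + 1)


def h_given_k_above(k, n=8):
    """H when k observations of the more variable sample lie above the
    other sample and the remaining n - k lie below it."""
    below = n - k
    # The more variable sample holds ranks 1..below and (2n-k+1)..2n.
    r_wide = below * (below + 1) // 2 + sum(range(2 * n - k + 1, 2 * n + 1))
    r_narrow = n * (2 * n + 1) - r_wide  # all 2n ranks sum to n(2n+1)
    return h_statistic([r_narrow, r_wide], [n, n])


def prob_h_at_least(h0, p_above, n=8):
    """P(H >= h0) when each observation of the more variable sample lies
    above the other sample independently with probability p_above."""
    return sum(
        comb(n, k) * p_above**k * (1 - p_above) ** (n - k)
        for k in range(n + 1)
        if h_given_k_above(k, n) >= h0 - 1e-9  # tolerance for float rounding
    )


h_obs = h_given_k_above(7)                 # rank sums 44 and 92
p_symmetric = prob_h_at_least(h_obs, 0.5)  # symmetrical alternative
p_skew = prob_h_at_least(h_obs, 0.35)      # skew alternative
```

With seven observations above and one below, this reproduces H = 6.353 and gives probabilities of about 0.07 and 0.17 under the symmetrical and skew alternatives, in agreement with the figures quoted in the text.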


5. RELATED TESTS 


5.1. Permutation Tests and Ranks 


The H test stems from two statistical methods: permutations of the 
data and rank transformations. 

Permutation tests, which to the best of our knowledge were first 
proposed by Fisher [8] in connection with a defense of the normality 
assumption, accept or reject the null hypothesis according to the prob- 
ability of a test statistic among all relevant permutations of the ob- 
served numbers; a precise general formulation of the method is given 
by Scheffé [45]. Applications of the permutation method to important 
cases may be found in articles by Pitman [41, 42, 43] and by Welch 
[57]. 

The use of ranks (or, more generally, of conventional numbers) instead 
of the observations themselves has been proposed often, and we 
do not know to whom this idea may be credited.¹³ Its advantages have 
been summarized in Section 1.3. Its disadvantage is loss of information 
about exact magnitudes. 

If in one-criterion variance analysis the permutation method based 
on the conventional F statistic is combined with the rank method, the 
result is the H test. 


5.2. Friedman’s χ_r² 


Two kinds of data must be distinguished in discussing tests for the 
equality of C population averages. The first kind consists of C inde- 
pendent random samples, one from each population. The second kind 
consists of C samples of equal size which are matched (that is, cross- 
classified or stratified, each stratum contributing one observation to 
each sample) according to some criterion which may affect the values 
of the observations. This distinction is, of course, exactly that between 
one-criterion variance analysis with equal sample sizes and two-crite- 
rion variance analysis with one observation per cell. 

For comparing the weights of men and women, data of the first kind 
might be obtained by measuring a random sample of n₁ men and an 
independent random sample of n₂ women. Such data would ordinarily 
be analyzed by one-criterion variance analysis, as described in Section 
1.2 above, which in the two-sample case is equivalent to the two-tail t 
test with n₁+n₂−2 degrees of freedom. The H test, or the two-sample 
version of it given by (3.4), would also be applicable. 

Data of the second kind for the same problem might be obtained by 
selecting n ages (not necessarily all different) and for each age selecting 
at random one man and one woman. Such data would ordinarily be 
analyzed by two-criterion variance analysis, the between-sexes 
component being the one tested. This test would be equivalent to the 
two-tail t test of the mean difference, with n−1 degrees of freedom. 
Friedman’s χ_r² [10], or the two-tail sign test which is its two-sample version, 
would be appropriate.¹⁴ 


13 Our attention has been directed by Harold Hotelling to the use of ranks by Galton [12, Chaps. 
4 and 5] in 1889. Churchill Eisenhart and I. Richard Savage have referred us to the extensive analyses of 
ranks by eighteenth century French mathematicians in connection with preference-ordering problems, 
specifically elections. The earliest work they mention is by Borda [1] in 1770, and they mention also 
Laplace [26] in 1778, Condorcet [3] in 1786, and Todhunter’s summary of these and related writings 
[51, Secs. 690, 806, 989, 990]. Systematic treatment of ranks as a nonparametric statistical device, 
however, seems to commence with the work of Hotelling and Pabst [19] in 1936. 

The H test thus provides a rank test for data of the first kind, just 
as the χ_r² test does for data of the second kind. H makes it possible to 
test by ranks the significance of a grouping according to a single 
criterion. The effect of one criterion cannot be tested by χ_r² unless the 
observations in the different groups are matched according to a second 
criterion. On the other hand, if the data are matched H is not 
appropriate and χ_r² should be used. 


5.3. Wilcoxon’s Two-Sample Test 


The H test in its general form is new, as far as we know,¹⁵ but not its 
two-sample form. 

5.3.1. Wilcoxon (1945, 1947). Wilcoxon was the first, we believe, to 
introduce the two-sample form. His first paper [61] considers the case 
of two samples of equal size and gives true probabilities for values of 
the smaller sum of ranks in the neighborhood of the 0.01, 0.02, and 0.05 
probability levels for sample sizes from 5 to 10. A method of calculating 
the true probabilities is given. An example uses the mean-rank method 
for ties, interpreting the result in terms of a table for the no-ties situa- 
tion. 

In a second paper [62] on the case of two equal samples, Wilcoxon 
gives a normal approximation to the exact distribution, basing it on 
the theory of sampling without replacement from a finite uniform popu- 
lation, along the lines of Section 3.1 of the present paper. A table of 5 
per cent, 2 per cent, and 1 per cent significance levels for the smaller to- 
tal is given, covering sample sizes from 5 to 20. 

5.3.2. Festinger (1946).¹⁶ Wilcoxon’s test was discovered independently 
by Festinger [7], who considers the case where the two sample 
sizes, n and m, are not necessarily equal. He gives a method of calculating 
true probabilities, and a table of two-tail 5 per cent and 1 per cent 
significance levels for n from 2 to 12 with m from n to 40−n, and for n 
from 13 to 15 with m from n to 30−n; and more extensive tables are 
available from him. A large proportion of the entries in Festinger’s 
table, especially at the 5 per cent level, seem to be slightly erroneous.²¹ 


14 For other discussions of χ_r², see Kendall and Smith [21], Friedman [11], and Wallis [55]. 

15 After an abstract [24] of a theoretical version [25] of the present paper was published we learned 
from T. J. Terpstra that similar work has been done at the Mathematical Center, Amsterdam, and that 
papers closely related to the H test will be published soon by himself [50] and by P. G. Rijkoort [44]; 
also that P. van Elteren and A. Benard are doing some research related to χ_r². References [50] and [44] 
propose tests based upon statistics similar to, but not identical with, H. 

Alan Stuart tells us that H. R. van der Vaart (University of Leiden) has been planning a 
generalization of the Wilcoxon test to several samples. 

P. V. Krishna Iyer has announced [23] “a non-parametric method of testing k samples.” This 
brief announcement is not intelligible to us, but it states that “full details will be published in the 
Journal of the Indian Society of Agricultural Research.” 

16 We are indebted to Alan Stuart for calling our attention to Festinger’s paper. 

5.3.3. Mann and Whitney (1947). Mann and Whitney [28] made an 
important advance in showing that Wilcoxon’s test is consistent for the 
null hypothesis that the two populations are identical against the 
alternative that the cumulative distribution of one lies entirely above 
that of the other.¹⁷ They discuss the test in terms of a statistic U which, 
as they point out, is equivalent to Wilcoxon’s sum of ranks (our R). 
When all observations from both samples are arranged in order, they 
count for each observation in one sample, say the first, the number of 
observations in the second sample that precede it. The sum of these 
counts for the first sample is called U. It is related to R, the sum of the 
ranks for the first sample, by¹⁸ 

$$U = R - \frac{n(n + 1)}{2}. \tag{5.1}$$

They give a table showing the one-tail probability to three decimals for 
each possible value of U, for all combinations of sample sizes in which 
the larger sample is from three to eight.¹⁹ 

Hemelrijk [16] has pointed out recently that U, and consequently R 
for the two-sample case, may be regarded as a special case of Kendall’s 
coefficient of rank correlation [20]. 

5.3.4. Haldane and Smith (1948).²⁰ Haldane and Smith [14] developed 
the Wilcoxon test independently in connection with the problem 
of deciding whether the probability of a hereditary trait appearing in a 
particular member of a sibship depends on his birth-rank. They propose 
a test based on the sum of the birth-ranks of those members of a sibship 
having the trait (i.e., our R), where N is the number in the sibship and 
n is the number having the trait. They develop an approximate 
distribution from the theory of sampling from an infinite, continuous, uniform 
population, and approximate this by the unit normal deviate given in 
this paper as (3.4), including the continuity adjustment, which they 
seem to be the first to use. They tabulate the means and variances of 
6R for values of N from 2 to 20, with n from 1 to N. They also give a 
table of exact probabilities (not cumulated) for all possible values of 
n up to N=12. 


17 Actually the test is consistent under more general conditions; see Section 5.3.6 (iv). 

18 Mann and Whitney’s version of this formula is a trifle different because they relate the count in 
the first sample (our terminology) to the sum of ranks in the other sample. 

19 We have recomputed the Mann-Whitney table to additional decimals. It agrees entirely with 
our computations. 

20 We are indebted to Alan Stuart for calling our attention to the Haldane and Smith paper. 

† Blair M. Bennett, University of Washington, is computing power functions for the Wilcoxon test 
against alternatives appropriate to the birth-order problem. Bennett emphasizes, in a personal 
communication, that the distribution of R under the null hypothesis corresponds to a partition problem which 
has been studied in the theory of numbers for centuries, in particular by Euler [6a, Chap. 16], who in 
1748 considered closely related partition problems and their generating functions, and by Cauchy [2a, 
Numbers 225, 226]. In fact, Euler [6a, p. 252*] gives a table which is in part equivalent to that of Mann 
and Whitney [28]. This number-theoretic approach is discussed by Wilcoxon [61]. 

Haldane and Smith discuss the problem of ties in connection with 
multiple births. They propose to assign to each member of each birth 
the rank of that birth. In our terminology, they give each member of 
a tied group the lowest of the ranks tied for, and give the next individ- 
ual or group the next rank, not the rank after the highest in the group 
tied for. For a test in this case, they refer to the theory of sampling 
without replacement from a finite but non-uniform population. 

With the Haldane-Smith method of handling ties, the difference be- 
tween the ranks of two non-tied observations is one more than the 
number of distinct values or groups intervening between the two, re- 
gardless of the number of intervening individuals; with the mean-rank 
method, the difference is one more than the number of observations in- 
tervening, plus half the number of other observations having the same 
rank as either of the two observations being compared. The mean-rank 
method seems preferable when the cause of ties is measurement limita- 
tions on an effectively continuous variable, the Haldane-Smith method 
when the cause is actual identity. Unfortunately, the Haldane-Smith 
method does not lend itself so readily as does the mean-rank method 
to simple adjustment of the formulas for the no-ties case, since the 
necessary adjustments depend upon the particular ranks tied for, not 
merely the number of ties. 
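
The two ways of handling ties can be compared on a small example. The Python sketch below is illustrative only; the function names and the sample values are assumptions, not taken from Haldane and Smith or from this paper.

```python
def mean_ranks(values):
    """Mean-rank method: each member of a tied group receives the mean of
    the ranks tied for."""
    lowest = {}  # value -> lowest rank tied for
    count = {}   # value -> size of its tied group
    for position, value in enumerate(sorted(values), start=1):
        lowest.setdefault(value, position)
        count[value] = count.get(value, 0) + 1
    return [lowest[v] + (count[v] - 1) / 2 for v in values]


def haldane_smith_ranks(values):
    """Haldane-Smith method: each tied group receives the lowest rank tied
    for, and the next individual or group receives the next rank."""
    rank_of = {v: i for i, v in enumerate(sorted(set(values)), start=1)}
    return [rank_of[v] for v in values]


data = [3, 5, 5, 5, 8, 9, 9]  # hypothetical observations with two tied groups
print(mean_ranks(data))
print(haldane_smith_ranks(data))
```

For the observations 3 and 8, the Haldane-Smith ranks differ by 2 (one intervening group, plus one), while the mean ranks differ by 4 (three intervening observations, plus one), as the rule stated above requires.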

5.3.5. White (1952). Tables of critical values of R at two-tail sig- 
nificance levels of 5, 1, and 0.1 per cent for all sample sizes in which 
N ≤ 30 are given by White [59].²¹ He suggests that ties be handled by 
the mean-rank method, not allowing for its effect on the significance 
level, or else by assigning the ranks so as to maximize the final proba- 
bility, which may then be regarded as an upper limit for the true prob- 
ability. 

21 Comparison of the 5 and 1 per cent levels given by White with Festinger’s earlier and more 
extensive table [7] shows 104 disagreements among 392 comparable entries (78 disagreements among 196 
comparisons at the 5 per cent level, and 26 among 196 at 1 per cent). In each disagreement, Festinger 
gives a lower critical value of the statistic, although both writers state that they have tabulated the 
smallest value of the statistic whose probability does not exceed the specified significance level. Three 
of the disagreements can be checked with the Mann-Whitney table [28]; in all three, White’s entry 
agrees with Mann-Whitney’s. In one additional case (sample sizes 4 and 11 at the 1 per cent level) we 
have made our own calculation and found Festinger’s entry to have a true probability (0.0103) 
exceeding the stated significance level. The disagreements undoubtedly result from the fact that the 
distributions are discontinuous, so that exact 5 and 1 per cent levels cannot ordinarily be attained. 

5.3.6. Power of Wilcoxon’s test. The power of nonparametric tests in 
general, and of the H test in particular, is difficult to investigate; but 
for the special case of Wilcoxon’s two-sample test certain details have 
been discovered. Some that are interesting from a practical viewpoint 
are indicated below, but without the technical qualifications to which 
they are subject: 

(i) Lehmann [27] has shown that the one-tail test is unbiased—that 
is, less likely to reject when the null hypothesis is true than when any 
alternative is true—but van der Vaart [52] has shown that the corre- 
sponding two-tail test may be biased. 

(ii) Lehmann [27] has shown, on the basis of a theorem of 
Hoeffding’s [17], that under reasonable alternative hypotheses, as under the 
null hypothesis, the distribution of √H is asymptotically normal. 

(iii) Mood [33] has shown that the asymptotic efficiency of 
Wilcoxon’s test compared with Student’s test, when both populations are 
normal with equal variance, is 3/π, i.e., 0.955. Roughly, this means 
that 3/π is the limiting ratio of sample sizes necessary for the two tests 
to attain a fixed power. This result was given in lecture notes by 
E. J. G. Pitman at Columbia University in 1948; it was also given by 
van der Vaart [52]. To the best of our knowledge, Mood’s proof is the 
first complete one. 

(iv) Lehmann [27] and van Dantzig [15, 51a], generalizing the 
findings of Mann and Whitney [28], have shown that the test is consistent¹² 
if the probability differs from one-half that an observation from the first 
population will exceed one drawn independently from the second popu- 
lation (for one-tail tests the condition is that the probability differ from 
one-half in a stated direction). In addition van Dantzig [51a] gives 
inequalities for the power. The C-sample condition for consistency 
given by Kruskal (see Section 4.1) is a direct extension of the two 
sample condition given by Lehmann and van Dantzig. 
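
The relation (5.1) between U and R, and its connection with the probability appearing in (iv), can be illustrated briefly. The Python sketch below assumes no ties; the sample values and function names are hypothetical. Dividing U by the product of the sample sizes gives the proportion of pairs in which the observation from the first sample is the larger, a natural sample counterpart of the probability in the consistency condition.

```python
def rank_sum_first(first, second):
    """R: sum of the ranks of the first sample in the combined ordering.
    Assumes all observations are distinct (no ties)."""
    combined = sorted(first + second)
    return sum(combined.index(x) + 1 for x in first)


def u_by_counting(first, second):
    """U: for each observation in the first sample, count the observations
    of the second sample that precede it, and sum the counts."""
    return sum(1 for x in first for y in second if y < x)


first = [1.2, 3.4, 5.6]        # hypothetical first sample, n = 3
second = [0.5, 2.0, 4.1, 6.3]  # hypothetical second sample, m = 4

n, m = len(first), len(second)
r = rank_sum_first(first, second)
u = u_by_counting(first, second)
u_from_r = r - n * (n + 1) // 2   # relation (5.1)
proportion_first_larger = u / (n * m)
```

In this example R = 12 and U = 6 by either route, so the first sample is the larger in exactly half of the twelve pairs.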


5.4. Whitney’s Three-Sample Test 


Whitney [60] has proposed two extensions of the Wilcoxon test to the 
three-sample case. Neither of his extensions, which are expressed in 
terms of inversions of order rather than in terms of ranks, is equivalent 
to our H test for C=3, since Whitney seeks tests with power against 
more specific alternatives than those appropriate to the H test. 

Whitney arrays all three samples in a single ranking and then defines 
U as the number of times in which an observation from the second 
sample precedes an observation from the first and V as the number of 
times in which an observation from the third sample precedes one from 
the first.²² 


22 U and V are not determined by $R_1$, $R_2$, and $R_3$, nor vice versa, though 
$U + V = R_1 - \tfrac{1}{2} n_1 (n_1 + 1)$. 

Whitney’s first test, which rejects the null hypothesis of equality of 
the populations if both U and V are too small (alternatively, too large), 
is suggested when the alternative is that the cumulative distribution 
of the first population lies above (alternatively, below) those of both 
the second and third populations. His second test, which rejects if U is 
too large and V is too small, is suggested when the alternative is that 
the cumulative distribution of the first population lies below that of the 
second and above that of the third. 


5.5. Terpstra’s C-Sample Test 


Terpstra [50a] has proposed and investigated a test appropriate for 
alternatives similar to those of Whitney’s second test, but extending 
to any number of populations. 


5.6. Mosteller’s C-Sample Test 


Mosteller [34] has proposed a multi-decision procedure for accepting 
either the null hypothesis to which the H test is appropriate or one of 
the C alternatives that the ith population is translated to the right (or 
left) of the others. His criterion is the number of observations in the 
sample containing the largest observation that exceed all observations 
in other samples. This procedure has been discussed further by Mos- 
teller and Tukey [35]. 


5.7. Fisher and Yates’ Normalized Ranks 


Fisher and Yates have proposed [9, Table XX] that each observa- 
tion be replaced not by its simple rank but by a normalized rank, de- 
fined as the average value of the observation having the corresponding 
rank in samples of N from a normal population with mean of zero and 
standard deviation of one. They propose that ordinary one-criterion 
variance analysis then be applied to these normalized ranks. Ehrenberg 
[5] has suggested as a modification using the values of a random sam- 
ple of N from the standardized normal population. 
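
The normalized ranks defined above are the expected values of the order statistics of a standard normal sample. The Python sketch below approximates them by numerical integration of the order-statistic density with the trapezoidal rule. The integration range, the step count, and the function names are our assumptions, and this is not the method by which Fisher and Yates computed their table.

```python
from math import comb, erf, exp, pi, sqrt


def normal_density(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def normal_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def normalized_rank(r, n, lo=-10.0, hi=10.0, steps=20000):
    """Approximate the expected value of the r-th smallest observation in a
    sample of n from the standard normal population (trapezoidal rule)."""
    coefficient = r * comb(n, r)  # n! / ((r-1)! (n-r)!)
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * width
        c = normal_cdf(x)
        f = x * normal_density(x) * c ** (r - 1) * (1.0 - c) ** (n - r)
        total += f if 0 < i < steps else 0.5 * f  # half weight at the ends
    return coefficient * total * width


# Normalized ranks that would replace the simple ranks 1..5 when N = 5.
scores = [normalized_rank(r, 5) for r in range(1, 6)]
```

As a check, for N = 2 the larger observation has expected value 1/√π, about 0.564, and the scores for any N are symmetric about zero.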

Two advantages might conceivably be gained by replacing the 
observations by normalized ranks or by some other set of numbers instead 
of by simple ranks. First, it might be that the distribution theory would 
be simplified. Quite a large class of such transformations, for example, 
lead to tests whose distribution is asymptotically χ²(C−1); but for 
some transformations the χ² approximation may be satisfactory at 
smaller sample sizes than for others, thus diminishing the area of need 
for special tables and approximations such as those presented in Section 6. 
Second, the power of the test might be greater against important 
classes of alternatives. 

Whether either of these possible advantages over ranks is actually 
realized by normalized ranks, or by any other specific transformation, 
has not to our knowledge been investigated. Offhand, it seems 
intuitively plausible that the χ² distribution might be approached more 
rapidly with normalized ranks, or some other set of numbers which 
resemble the normal form more than do ranks. On the other hand, it 
seems likely that if there is such an advantage it is not very large, 
partly because the distribution of means from a uniform population 
approaches normality rapidly as sample size increases, and partly 
because (as Section 6 indicates) the distribution of H approaches the χ² 
distribution quite rapidly as sample sizes increase. As to power, we 
have no suggestions, except the obvious one that the answer is likely 
to differ for different alternatives of practical interest.²³ 


5.8. Other Related Tests 


A number of tests have been proposed which have more or less the 
same purpose as H and are likewise non-parametric. We mention here 
only two of the principal classes of these. 

5.8.1. Runs. Wald and Wolfowitz [53] have proposed for the two- 
sample case that all observations in both samples be arranged in order 
of magnitude, that the observations then be replaced by designations 
A or B, according to which sample they represent, and that the num- 
ber of runs (i.e., groups of consecutive A’s or consecutive B’s) be used 
to test the null hypothesis that both samples are from the same popula- 
tion. The distribution theory of this test has been discussed by Stevens 
[48], Wald and Wolfowitz [53], Mood [31], Krishna Iyer [22], and oth- 
ers; and Swed and Eisenhart [49] have provided tables covering all 
cases in which neither sample exceeds 20. For larger samples, normal 
approximations are given by all the writers mentioned. Wald and 
Wolfowitz discussed the consistency of the test, and later Wolfowitz 
[63] discussed its asymptotic power. An extension to cases of three or 
more samples has been given by Wallis [56], based on the distribution 
theory of Mood and Krishna Iyer. 
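
Counting the runs for the two-sample case is straightforward. The Python sketch below is an illustration only: it assumes non-empty samples with no ties between them, and the function name and data are hypothetical. It computes the statistic only, not its null distribution.

```python
def count_runs(first, second):
    """Number of runs of A's and B's when both samples are arranged in one
    ordering. Assumes non-empty samples and no ties between samples."""
    labelled = sorted([(x, "A") for x in first] + [(y, "B") for y in second])
    labels = [label for _, label in labelled]
    # A new run starts wherever two neighbouring labels differ.
    return 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)


interleaved = count_runs([1.2, 3.4, 5.6], [0.5, 2.0, 4.1, 6.3])  # B A B A B A B
separated = count_runs([1, 2, 3], [4, 5, 6])                     # A A A B B B
```

Thoroughly interleaved samples give many runs, while samples from well separated populations give as few as two, so small numbers of runs count against the null hypothesis.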

5.8.2. Order statistics. Westenberg [58] has suggested a test for the 
two-sample case utilizing the number of observations in each sample 
above the median of the combined samples. Mood and Brown [32, pp. 
394-5, 398-9] have discussed the test further and generalized it to 
several samples. Massey [29] has generalized the test further by using 
other order statistics of the combined samples as a basis for a more 
refined classification. 


23 When the true distributions are normal, Hoeffding [18] has shown that in many cases, including 
at least some analysis of variance ones, the test based on normalized ranks becomes as powerful as that 
based on the actual observations, when the sample sizes increase toward infinity. 


6. SIGNIFICANCE LEVELS, TRUE AND APPROXIMATE 
6.1. True Significance Levels 


6.1.1. Two samples. Festinger [7], Haldane and Smith [14], Mann 
and Whitney [28], White [59], and Wilcoxon [61, 62] have published 
tables for the two-sample case. These are described in Section 5.3. 
They are exact only if ties are absent or are handled by the random- 
rank method, but our guess is that they will also serve well enough if 
the mean-rank method is used and there are not too many ties. 

6.1.2. Three samples. (i) Five or fewer observations in each sample. For
each of these cases, Table 6.1 shows three pairs of values of H and their
probabilities of being equalled or exceeded if the null hypothesis is
true.²⁴ Each pair brackets as closely as possible the 10, 5, or 1 per cent
level, except that in some cases one or both members of a pair are
missing because H can take only a small number of values. The final
sentence of Section 6.1.1, about ties, applies to Table 6.1 also.

(ii) More than five observations in each sample. No exact tables are
available for these cases. Our recommendation is that the $\chi^2$
approximation be used. Only at very small significance levels (less than
1 per cent, say) and sample sizes only slightly above five is there likely
to be appreciable advantage to the more complicated $\Gamma$ and B
approximations described in Section 6.2. This recommendation is based
only on the comparisons shown in Table 6.1, no true probabilities having
been computed in this category.

(iii) Intermediate cases. No exact tables are available here. The
$\Gamma$ and B approximations probably should be resorted to if more than
roughly approximate probabilities are required. Except at very low
significance levels or with very small samples, the $\Gamma$
approximation, which is simpler, should serve. This recommendation is not
very firm, however, since we have computed no true probabilities in this
category.

6.1.3. More than three samples. Since we have computed no true
probabilities for more than three samples, our recommendations here
must be entirely tentative. It seems safe to use the $\chi^2$
approximation when all samples are as large as five. If any sample is
much smaller than five, the $\Gamma$ or B approximation should probably
be used, especially at low significance levels, though the importance of
this presumably is less the larger the proportion of samples of more
than five.


24 These computations and others used for this paper were made by John P. Gilbert with the
assistance of Billy L. Foster, Thomas O. King, and Roland Silver. Space prevents reproducing all or even
most of the results, but we hope to file them in such a way that interested workers may have access to
them. We have the true joint distributions of $R_1$, $R_2$, and $R_3$ under the null hypothesis for
$n_1$, $n_2$, and $n_3$ each from 1 through 5, and the true distribution of H under the same conditions,
except that for some cases we have probabilities only for those values of H exceeding the upper twenty
per cent level.


6.2. Approximate Significance Levels 


6.2.1. $\chi^2$ approximation. This is the approximation discussed in
Sections 1, 2, and 3. The most extensive single table is that of Hald and
Sinkbaek [13], though the table in almost any modern statistics text
will ordinarily suffice.

6.2.2. $\Gamma$ approximation. This utilizes the incomplete-$\Gamma$
distribution by matching the variance as well as the true mean of H. The
mean, or expected value, of H under the null hypothesis is [25]

$$E = C - 1 \tag{6.1}$$

and the variance is

$$V = 2(C-1) - \frac{2\,[3C^2 - 6C + N(2C^2 - 6C + 1)]}{5N(N+1)} - \frac{6}{5}\sum_{i=1}^{C}\frac{1}{n_i}. \tag{6.2}$$

One way of applying the approximation is to enter an ordinary $\chi^2$
table taking $\chi^2 = 2HE/V$ and degrees of freedom $f = 2E^2/V$. Note
that the degrees of freedom will not ordinarily be an integer, so
interpolation will be required in both $\chi^2$ and the degrees of
freedom if the four bounding tabular entries do not define the
probability accurately enough.²⁵
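
Entering a $\chi^2$ table with $2HE/V$ and $2E^2/V$ degrees of freedom is the same as treating H as a gamma variate with mean E and variance V. A minimal sketch of that calculation follows, using only the Python standard library. The helper names `upper_gamma_q` and `gamma_approx_p` are hypothetical, the incomplete-gamma routine is a generic series and continued-fraction evaluation rather than anything taken from the paper, and the numbers are those of the illustration in Section 6.2.3 (three samples of sizes 5, 4, and 3).

```python
import math


def upper_gamma_q(a, x, eps=1e-12, max_iter=500):
    """Regularized upper incomplete gamma function Q(a, x), a > 0, x >= 0."""
    if x <= 0.0:
        return 1.0
    log_prefactor = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Power series for the lower function P(a, x); Q = 1 - P.
        term = 1.0 / a
        total = term
        denom = a
        for _ in range(max_iter):
            denom += 1.0
            term *= x / denom
            total += term
            if abs(term) < abs(total) * eps:
                break
        return 1.0 - total * math.exp(log_prefactor)
    # Continued fraction for Q(a, x), evaluated by the modified Lentz method.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(log_prefactor)


def gamma_approx_p(h, e, v):
    """P(H >= h) when H is treated as gamma with mean e and variance v."""
    # Shape E^2/V and argument HE/V: chi-square with 2E^2/V d.f. at 2HE/V.
    return upper_gamma_q(e * e / v, h * e / v)


# Values from the illustration in Section 6.2.3: H = 5.6308, E = 2, V = 3.0062.
print(round(gamma_approx_p(5.6308, 2.0, 3.0062), 3))
```

The result should land close to the $\Gamma$ approximation of 0.044 quoted in that illustration; small differences are to be expected, since the tabled value was obtained by linear interpolation.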

6.2.3. B approximation. This utilizes the incomplete-B distribution
by matching the true maximum as well as the mean and variance of H.
The maximum value of H is [25]

$$M = \frac{N^3 - \sum_{i=1}^{C} n_i^3}{N(N+1)}. \tag{6.3}$$

To apply the approximation, K. Pearson's table of the incomplete-B
distribution [39] may be employed, but it is usually more convenient to
use the F distribution, a form of the incomplete-B distribution, since
tables of F are widely accessible to statisticians.²⁶ We set

$$F = \frac{H(M - E)}{E(M - H)} \tag{6.4}$$

with degrees of freedom (not usually integers)

$$f_1 = E\,\frac{E(M - E) - V}{\tfrac{1}{2}MV}, \tag{6.5}$$

$$f_2 = \frac{M - E}{E}\,f_1. \tag{6.6}$$

The probability may then be obtained by three-way interpolation in
the F tables or by using Paulson's approximation [36], according to
which the required probability, P, is the probability that a unit normal
deviate will exceed

$$K_P = \frac{\left(1 - \dfrac{2}{9f_2}\right)F' - \left(1 - \dfrac{2}{9f_1}\right)}{\sqrt{\dfrac{2}{9f_2}F'^2 + \dfrac{2}{9f_1}}}, \tag{6.7}$$

where $F' = \sqrt[3]{F}$.

As an illustration, suppose $C = 3$, $n_1 = 5$, $n_2 = 4$, $n_3 = 3$, and
$H = 5.6308$. From (6.1), (6.2), and (6.3) we find $E = 2$, $V = 3.0062$,
and $M = 9.6923$. Substituting these into (6.4), (6.5), and (6.6) gives
$F = 5.332$, $f_1 = 1.699$, and $f_2 = 6.536$. Then (6.7) gives
$K_P = 1.690$, for which the normal distribution shows a probability of
0.046. This may be compared with the true probability of 0.050, the
$\chi^2$ approximation of 0.060, and the $\Gamma$ approximation of 0.044,
shown in Table 6.1.²⁷


25 The $\Gamma$ approximations shown in Table 6.1 were based on K. Pearson's table of the
incomplete-$\Gamma$ function [40]. In Pearson's notation, the required probability is $1 - I(u, p)$,
where $u = H/\sqrt{V}$ and $p = E^2/V - 1$. We used linear double interpolation, which on a few tests
seemed to be satisfactory in the region of interest.


26 The most detailed table of the F distribution is that of Merrington and Thompson [30].

27 The B approximations shown in Table 6.1 are based on K. Pearson's table of the incomplete-B
function [39]. In Pearson's notation, the required probability is $1 - I_x(p, q)$, where $x = H/M$,
$p = \tfrac{1}{2}f_1$, and $q = \tfrac{1}{2}f_2$. To simplify the three-way interpolation, the following
device (based on the relation of the incomplete-B to the binomial distribution, and of the binomial to
the normal distribution) was used: First, let $p_0$, $q_0$, and $x_0$ be the tabulated arguments closest
to p, q, and x, and as a first approximation to the required probability take $1 - I_{x_0}(p_0, q_0)$.
Second, add to this first approximation the probability that a unit normal deviate will not exceed (in
algebraic, not absolute, value)

$$K = \frac{p - \tfrac{1}{2} - x(p + q - 1)}{\sqrt{x(1 - x)(p + q - 1)}}.$$

Third, subtract from this the probability that a unit normal deviate will not exceed $K_0$, where $K_0$
is defined like K but in terms of $p_0$, $q_0$, and $x_0$. This method of interpolation was compared at
three points with the trivariate Everett formula to third differences as presented by Pearson [39,
Introduction]. The results were not excellent, but seemed to suffice for the present purposes.

Strictly speaking, all our statements and numerical results concerning the B approximation
(including entries in Table 6.1) actually apply to that approximation based on Pearson's tables in
combination with this normal interpolation device.

Values calculated in this way will not in general agree precisely with those calculated by
interpolating in the F tables or by using Paulson's approximation, though the example in the text
agrees to three decimals.
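
The illustration of the B approximation in Section 6.2.3 can be retraced numerically. The following is a minimal sketch, assuming the forms of (6.1) through (6.7) read as $E = C-1$, the variance of H, its maximum, the F ratio with its two degrees of freedom, and Paulson's normal deviate; the function names `h_moments` and `b_approx` are hypothetical, and the unit normal tail is taken from `math.erfc` rather than from a table.

```python
import math


def h_moments(sizes):
    """Null mean E, variance V and maximum M of H for the given sample sizes."""
    c = len(sizes)
    n = sum(sizes)
    e = c - 1
    v = (2 * (c - 1)
         - 2 * (3 * c**2 - 6 * c + n * (2 * c**2 - 6 * c + 1)) / (5 * n * (n + 1))
         - 6 / 5 * sum(1 / ni for ni in sizes))
    m = (n**3 - sum(ni**3 for ni in sizes)) / (n * (n + 1))
    return e, v, m


def b_approx(h, sizes):
    """F ratio, degrees of freedom, Paulson deviate and tail probability."""
    e, v, m = h_moments(sizes)
    f_stat = h * (m - e) / (e * (m - h))          # (6.4), requires h < m
    f1 = e * (e * (m - e) - v) / (0.5 * m * v)    # (6.5)
    f2 = (m - e) / e * f1                         # (6.6)
    root = f_stat ** (1 / 3)                      # F' = cube root of F
    k = (((1 - 2 / (9 * f2)) * root - (1 - 2 / (9 * f1)))
         / math.sqrt(2 / (9 * f2) * root**2 + 2 / (9 * f1)))  # (6.7)
    p = 0.5 * math.erfc(k / math.sqrt(2))         # P(unit normal > k)
    return f_stat, f1, f2, k, p


e, v, m = h_moments([5, 4, 3])
print(e, round(v, 4), round(m, 4))
print([round(x, 3) for x in b_approx(5.6308, [5, 4, 3])])
```

Run on the sample sizes 5, 4, 3 and H = 5.6308, this should reproduce the values quoted in the text (V = 3.0062, M = 9.6923, F = 5.332, and so on) to the number of decimals given there.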
Figure 6.3. Comparison of the true and approximate significance probabilities
for H in the neighborhoods of the 1, 5, and 10 per cent points, for three samples
of sizes 2 to 5. Crosses indicate that the smallest sample size exceeds 2, circles
that it is 2. Cases involving samples of 1 and a few involving samples of 2 have
been omitted.
TABLE 6.1


TRUE DISTRIBUTION OF H FOR THREE SAMPLES, EACH OF
SIZE FIVE OR LESS, IN THE NEIGHBORHOOD OF THE
10, 5, AND 1 PER CENT POINTS; AND COMPARISON
WITH THREE APPROXIMATIONS

The probabilities shown are the probabilities under the null hypothesis
that H will equal or exceed the values in the column headed “H”

Sample sizes             True     Approximate minus true probability
 n1  n2  n3     H        prob.    $\chi^2$  $\Gamma$    B
                                            (linear   (normal
                                            interp.)  interp.)

 2   1   1    2.7000     .500     -.241     -.309     -.500
 2   2   1    3.6000     .267     -.101     -.167     -.267
 2   2   2    4.5714     .067     +.035     -.007     -.067
              3.7143     .200     -.044     -.083     +.010
 3   1   1    3.2000     .300     -.098     -.180     -.300
 3   2   1    4.2857     .100     +.017     -.040     -.100
              3.8571     .133     +.012     -.045     -.042
 3   2   2    5.3572     .029     +.040     +.083     -.029
              4.7143     .048     +.047     +.012     +.014
              4.5000     .067     +.039     +.003     +.020
              4.4643     .105     +.002     -.033     -.014
 3   3   1    5.1429     .043     +.034     -.010     -.043
              4.5714     .100               -.046     -.062
              4.0000     .129     +.007     -.041     -.024
 3   3   2    6.2500     .011     +.033     +.012     -.011
              5.3611     .032     +.036     +.010     +.001
              5.1389     .061     +.016     -.012     -.019
              4.5556     .100     +.002     -.027     -.020
              4.2500     .121     -.002     -.031     -.014
 3   3   3    7.2000              +.024     +.010     -.004
              6.4889     .011     +.028     +.011     -.001
              5.6889     .029     +.030     +.009     +.003
              5.6000     .050               -.010     -.015
              5.0667     .086     -.006     -.029     -.026
              4.6222     .100     -.001     -.025     -.010
 4   1   1    3.5714     .200     -.032     -.114     -.200
 4   2   1    4.8214     .057     +.033     -.017     -.057
              4.5000     .076     +.029     -.022     -.047
              4.0179     .114     +.020     -.032     -.056
 4   2   2    6.0000     .014     +.036     +.010     -.014
              5.3333     .033     +.036     +.007     -.017
              5.1250     .052     +.025     -.006     -.021
              4.3750     .100     +.012     -.020     -.002
              4.1667     .105     +.020     -.012     +.014
TABLE 6.1 (Continued)

Sample sizes             True     Approximate minus true probability
 n1  n2  n3     H        prob.    $\chi^2$  $\Gamma$    B
                                            (linear   (normal
                                            interp.)  interp.)

 4   3   1    5.8333     .021     +.033
              5.2083     .050     +.024
              5.0000     .057     +.025
              4.0556     .093     +.039
              3.8889     .129     +.014
 4   3   2    5.4444                                  -.010
              5.4000     .052     +.016               -.013
              4.5111     .098                         -.004
              4.4667     .101                         -.003
 4   3   3    6.7091                                  -.003
              5.7909     .046                         -.013
              5.7273     .050                         -.015
              4.7091     .094                         -.006
              4.7000                                  -.012
 4   4   1    6.6667                                  -.010
              6.1667     .022                         -.020
              4.1667     .082     +.042               +.016
              4.0667              +.029               +.007
 4   4   2    7.0364     .006     +.024
              6.8727     .011     +.021               -.005
              5.4545     .046     +.020               -.003
              5.2364              +.021               +.001
              4.5545     .098     +.005     -.019     -.003
              4.4455     .103     +.006     -.018     +.000
 4   4   3    7.1439     .010     +.018     +.007     -.002
              7.1364     .011     +.018     +.006     -.003
              5.5985     .049     +.012     -.005     -.004
              5.5758     .051     +.011     -.006     -.005
              4.5455     .099     +.004     -.015     +.003
              4.4773              +.004     -.014     +.004
 4   4   4    7.6538     .008     +.014     +.005      .000
              7.5385     .011     +.012     +.003     -.002
              5.6923     .049     +.009     -.006     -.002
              5.6538     .054     +.005     -.010     -.007
              4.6539     .097     +.001     -.015     +.004
              4.5001              +.001     -.015     +.007
 5   1   1    3.8571     .143     +.003     -.109     -.143
TABLE 6.1 (Continued)

Sample sizes             True     Approximate minus true probability
 n1  n2  n3     H        prob.    $\chi^2$  $\Gamma$    B
                                            (linear   (normal
                                            interp.)  interp.)

 5   2   1    5.2500     .036     +.037     -.006     -.036
              5.0000     .048     +.034     +.011     -.037
              4.4500     .071     +.037     -.012     -.020
              4.2000     .095     +.027     -.022     -.018
              4.0500     .119     +.013     -.036     -.024
 5   2   2    6.5333     .008     +.030     +.010     -.008
              6.1333     .013     +.033     +.010     -.010
              5.1600     .034     +.041     +.013     +.008
              5.0400     .056     +.025     -.004     -.006
              4.3733     .090     +.022     -.007     +.010
              4.2933     .122     -.005     -.034     -.014
 5   3   1    6.4000     .012     +.029     +.002     -.012
              4.9600     .048     +.036     -.004     -.010
              4.8711              +.036     -.004     -.009
              4.0178              +.039     -.002     +.018
              3.         .123     +.024     -.016     +.010
 5   3   2    6.9091     .009     +.023     +.007     -.006
              6.8218     .010     +.023     +.007     -.006
              5.2509     .049     +.023     -.000     +.001
              5.1055     .052     +.026     +.003     +.006
              4.6509     .091     +.006     -.018     -.005
              4.4945     .101     +.005     -.020     -.003
 5   3   3    6.9818     .010     +.020     +.008     -.002
              6.8606     .011     +.022     +.008     -.001
              5.4424     .048     +.018     -.000     +.002
              5.3455     .050     +.019     +.000     +.004
              4.5333     .097     +.007     -.013     +.004
              4.4121     .109     +.001     -.018     +.000
 5   4   1    6.9545     .008     +.023     +.002     -.008
              6.8400              +.022     -.000     -.011
              4.9855     .044     +.038     +.002     -.001
              4.8600     .056     +.032     -.005     -.005
              3.9873     .098     +.038     +.001     +.018
              3.9600     .102     +.036     -.000     +.018
 5   4   2    7.2045     .009     +.018     +.005     -.005
              7.1182     .010     +.018     +.005     -.005
              5.2727     .049     +.023     +.002     +.005
              5.2682     .050     +.021     +.000     +.004
              4.5409     .098     +.005     -.017     -.002
              4.5182     .101     +.004     -.018     -.002
TABLE 6.1 (Continued)

Sample sizes             True     Approximate minus true probability
 n1  n2  n3     H        prob.    $\chi^2$  $\Gamma$    B
                                            (linear   (normal
                                            interp.)  interp.)

 5   4   3    7.4449     .010     +.014     +.004     -.004
              7.3949     .011     +.014     +.004     -.004
              5.6564     .049     +.010     -.005     -.004
              5.6308     .050     +.010     -.006     -.004
              4.5487     .099     +.004     -.013     +.003
              4.5231     .103     +.001     -.016     -.000
 5   4   4    7.7604     .009     +.011     +.003     -.002
              7.7440     .011     +.010     +.002     -.003
              5.6571     .049     +.010     -.004     +.000
              5.6176     .050     +.010     -.004     +.001
              4.6187     .100     -.001     -.016     +.003
              4.5527     .102     +.001     -.014     +.005
 5   5   1    7.3091     .009     +.016     -.002     -.009
              6.8364     .011     +.022     +.001     -.009
              5.1273     .046     +.031     -.003     -.005
              4.9091     .053     +.032     -.002     -.002
              4.1091     .086     +.042     +.007     +.020
              4.0364     .105     +.028     -.007     +.008
 5   5   2    7.3385     .010     +.016     +.004     -.004
              7.2692     .010     +.016     +.004     -.004
              5.3385     .047     +.022     +.003     +.006
              5.2462     .051     +.022     +.002     +.007
              4.6231     .097     +.002     -.018     -.005
              4.5077     .100     +.005     -.016     -.001
 5   5   3    7.5780     .010     +.013     +.004     -.001
              7.5429     .010     +.013     +.004     -.002
              5.7055     .046     +.012     -.003     +.000
              5.6264     .051     +.009     -.005     -.002
              4.5451     .100     +.003     -.012     +.007
              4.5363     .102     +.002     -.014     +.005
 5   5   4    7.8229     .010     +.010     +.003     -.002
              7.7914     .010     +.010     +.003     -.002
              5.6657     .049     +.010     -.003     +.001
              5.6429     .050     +.009     -.003     +.001
              4.5229     .099     +.005     -.009     +.010
              4.5200     .101     +.004     -.010     +.008
 5   5   5    8.0000     .009     +.009     +.003     -.002
              7.9800     .010     +.008     +.002     -.003
              5.7800     .049     +.007     -.005     -.001
              5.6600     .051     +.008     -.004     +.001
              4.5600     .100     +.003     -.010     +.008
              4.5000     .102     +.004     -.009     +.009
6.3. Comparisons of True and Approximate Significance Levels


Figures 6.1 and 6.2 show the true probabilities and the $\chi^2$,
$\Gamma$, and B approximations when the sample sizes are 3, 4, and 5,
and when they are all 5.²⁸

For each entry in Table 6.1 the probabilities given by the three
approximations have been computed and their errors recorded in the last
three columns of the table. In Figure 6.3 these errors are graphed
against the true probabilities. To avoid confusing this figure, sample
sizes have not been indicated; cases involving samples of one have been
omitted, and cases involving samples of two have been distinguished
from those in which the smallest sample exceeds two.
SERIAL NUMBER ANALYSIS


LEO A. GOODMAN*
University of Chicago 

The problem discussed is that of sampling without replace- 
ment from a discrete, finite, uniform population. One source of 
this problem is the analysis of serial numbers on manufac- 
tured items in order to estimate the total number of items 
manufactured. Minimum variance unbiased estimators of the 
parameters are obtained and compared with other estimators 
which have been suggested. Tests of hypothesis and confidence 
intervals are also discussed. 
1. INTRODUCTION AND SUMMARY

EARLY in 1943 the Economic Warfare Division of the American
Embassy in London started to analyze markings and serial numbers
obtained from captured German equipment in order to obtain esti- 
mates of German war production and capacity. Richard Ruggles and 
Henry Brodie have described, in an exciting and interesting paper [1], 
the story of the development of this technique in terms of the prob- 
lems which arose and the ways in which they were solved. They point 
out that the relative accuracy of the serial number estimates, evaluated 
in terms of the official statistics on German war production obtained 
after the war, indicates that this method of analysis was a valuable 
source of economic intelligence. Within the limits of its capabilities, the 
technique of analyzing markings on German equipment was superior 
to the more abstract methods of intelligence. 

Ruggles and Brodie suggest implicitly that one of the following sta- 
tistical models may be applied in many situations: 

(a) Initial number known. The n observed serial numbers,
$a_1, a_2, a_3, \ldots, a_n$, may be considered a random sample of n from
the integers $s+1, s+2, s+3, \ldots, s+p$, where s is known and p is
unknown.

(b) Initial number unknown. The n serial numbers may be considered
a random sample from the integers $s+1, \ldots, s+p$, where both
s and p are unknown.

The statistical problem in both (a) and (b) is to estimate p from the
serial numbers obtained. As an estimator of p when (a) applies, Ruggles
and Brodie implicitly suggest [1, p. 82] the intuitively reasonable
statistic r obtained by adding the average gap between the serial
numbers in the sample to the largest serial number obtained. We shall
show that this estimate is unbiased, and that its efficiency is high for
samples of large or moderate size. We shall show also that the most
efficient unbiased estimator e is very simply obtained by subtracting
one from (n+1)/n times the largest serial number obtained. With an
appropriate definition of “gap” (which differs from the definition used
in obtaining r) the efficient estimator e coincides with the statistic
suggested by Ruggles and Brodie. The loss of efficiency incurred by
using r instead of e is $n^{-2}$. The variances of the estimators r and e
will be presented and we will give simple efficient unbiased estimators
of the variances. The variances of the estimators of the variances will
also be presented, and we will obtain efficient unbiased estimators of
these variances. We shall also see how to test certain hypotheses about
p, and hence how to obtain confidence regions for p.


* This paper was prepared in connection with research sponsored by the Office of Naval Research.
The author is indebted to his colleagues in the Statistical Research Center, University of Chicago, and
to Richard Savage of the Bureau of Standards, for helpful comments.

Ruggles and Brodie also discuss the problem of estimating the pro- 
duction of separate factories on the basis of serial numbers which have, 
by ingenious devices, been identified as to their factory origin. The 
results obtained herein also apply to this case and we shall extend the 
analysis to the case where the average production of the factories and 
the variance of the production among the factories is to be estimated. 
We shall describe the results of the analysis of both situations (a) and 
(b). 

Some comparisons are made between the minimum variance un- 
biased estimator and the “closest” estimator discussed by R. C. Geary 
[2]. The asymptotic efficiency of Geary’s estimator is found to be 0.9. 

A numerical example is presented. Also, a test of randomness, which 
is suggested by the problem discussed herein, is applied to tables of 
“random permutations” [3, pp. 414-40]. 

Statistically, the problem is that of sampling without replacement 
from a discrete, finite, uniform (i.e., rectangular) population. Related 
cases are sampling with replacement from a discrete (finite or infinite) 
uniform population [4] and sampling from a continuous uniform popu- 
lation [2 and 5]. 


2. INITIAL NUMBER KNOWN 


Consider first the case where the initial number s+1 is known to be
1. That is, consider a random sample of n integers from the first p
integers. By random we mean that all $p!/[n!(p-n)!]$ subsets of n
integers have equal probability of constituting the sample. The problem
is to estimate p.

In Section 4, it is shown that the unbiased estimator of p with
minimum variance is $e = g(n+1)/n - 1$, where g is the largest integer in
the sample. The variance of the estimator e is

$$\sigma^2(e) = \frac{(p+1)(p-n)}{(n+2)n}.$$

Also,

$$s^2(e) = \frac{g(g-n)}{n^2}$$

is the minimum-variance unbiased estimator of $\sigma^2(e)$. Furthermore,

$$\sigma^2[s^2(e)] = \frac{\sigma^2(e)\,[4\sigma^2(e) + 1]}{n(n+4)},$$

and

$$s^2[s^2(e)] = \frac{s^2(e)\,[4s^2(e) + 1]}{(n+2)^2}$$

is the minimum-variance unbiased estimator of $\sigma^2[s^2(e)]$.
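
For a small population these statements can be checked by brute force. The sketch below enumerates every sample of n integers from 1 to p, treats them as equally likely, and averages e and s²(e) exactly; the function name `exhaustive_check` and the choice p = 10, n = 4 are purely illustrative.

```python
from fractions import Fraction
from itertools import combinations


def exhaustive_check(p, n):
    """Mean of e, variance of e, and mean of s^2(e) over all n-subsets of 1..p."""
    samples = list(combinations(range(1, p + 1), n))
    # e = g(n+1)/n - 1 and s^2(e) = g(g-n)/n^2, with g the largest integer drawn.
    e_vals = [Fraction(max(s) * (n + 1), n) - 1 for s in samples]
    s2_vals = [Fraction(max(s) * (max(s) - n), n * n) for s in samples]
    mean_e = sum(e_vals) / len(samples)
    var_e = sum((e - mean_e) ** 2 for e in e_vals) / len(samples)
    mean_s2 = sum(s2_vals) / len(samples)
    return mean_e, var_e, mean_s2


mean_e, var_e, mean_s2 = exhaustive_check(p=10, n=4)
print(mean_e, var_e, mean_s2)  # 10 11/4 11/4
```

The first value is p itself, and the other two agree with (p+1)(p-n)/(n+2)n = 11·6/24, which is what unbiasedness of e and of s²(e) requires.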

Suppose we wish to test the null hypothesis $p = p_0$ against the
alternative hypothesis $p > p_0$. Then the region of rejection for a
significance test at level $\alpha$ is obviously $g > c_1$, where $c_1$
is the largest integer satisfying

$$\Pr\{g \le c_1 \mid n, p_0\} = c_1^{(n)}/p_0^{(n)} \le 1 - \alpha,$$

and $c^{(m)} = c!/(c-m)!$. If we wish to test the null hypothesis
$p = p_0$ against the alternative hypothesis $p < p_0$, then the
rejection region for a significance test at level $\alpha$ is
$g \le c_2$, where $c_2$ is the smallest integer satisfying

$$\Pr\{g \le c_2 \mid n, p_0\} = c_2^{(n)}/p_0^{(n)} \ge \alpha.$$

The unbiased test at level $\alpha$ of the null hypothesis $p = p_0$
against the two-sided alternative $p \ne p_0$ is defined by the
acceptance region $c_2 < g \le p_0$. Note that this test is unbiased,
while a two-sided test at the $2\alpha$ level based on the region
$c_2 < g \le c_1$ is biased.

The results of the preceding paragraph may now be used to obtain
confidence intervals. That is, the left-sided $1-\alpha$ confidence
interval is $p \ge d_1$, where $d_1$ is the smallest integer satisfying
$g^{(n)}/d_1^{(n)} \le 1-\alpha$. The right-sided $1-\alpha$ confidence
interval is $p \le d_2$, where $d_2$ is the largest integer satisfying
$(g-1)^{(n)}/d_2^{(n)} \ge \alpha$. The two-sided $1-\alpha$ confidence
interval is

$$g \le p \le d_2.$$
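
The upper limit $d_2$ is easy to find by direct search. A minimal sketch, assuming the two-sided interval runs from g to $d_2$ as the acceptance region $c_2 < g \le p_0$ implies, and using the inequality for $d_2$ exactly as stated; the function names and the data (n = 5 serial numbers with largest value 60) are hypothetical.

```python
from fractions import Fraction


def falling(c, m):
    """Falling factorial c^(m) = c!/(c-m)! = c(c-1)...(c-m+1)."""
    out = 1
    for k in range(m):
        out *= c - k
    return out


def upper_limit(g, n, alpha):
    """Largest integer d >= g with (g-1)^(n) / d^(n) >= alpha, else None."""
    top = falling(g - 1, n)
    if Fraction(top, falling(g, n)) < alpha:
        return None  # even d = g fails the inequality
    d = g
    # The ratio decreases as d grows, so step upward until it drops below alpha.
    while Fraction(top, falling(d + 1, n)) >= alpha:
        d += 1
    return d


# Hypothetical data: n = 5 serial numbers, the largest being g = 60.
g, n, alpha = 60, 5, Fraction(1, 20)
print("two-sided interval:", g, "<= p <=", upper_limit(g, n, alpha))
```

Exact rational arithmetic is used so that the comparison with α is not affected by rounding when the ratio is close to the boundary.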


Now consider the case where we are interested in estimating the
production $p_i$ of factory i (i = 1, 2, 3, ..., N) from the $n_i$
serial numbers obtained from factory i. Clearly what has been said holds
in this case when we substitute $n_i$ for n, $p_i$ for p, $g_i$ for g,
and $e_i$ for e. If we are concerned with

$$\bar{p} = \sum_{i=1}^{N} p_i / N$$

and no assumptions are made about the relation between the $p_i$, then

$$\bar{e} = \sum_{i=1}^{N} e_i / N$$

is the minimum-variance unbiased estimator of $\bar{p}$. Since the
variance of $\bar{e}$ is

$$\sigma^2(\bar{e}) = \sum_{i=1}^{N} \sigma^2(e_i)/N^2,$$

we define

$$s^2(\bar{e}) = \sum_{i=1}^{N} s^2(e_i)/N^2$$

and find that $s^2(\bar{e})$ is the minimum-variance unbiased estimator
of $\sigma^2(\bar{e})$. Also,

$$\sigma^2[s^2(\bar{e})] = \sum_{i=1}^{N} \sigma^2[s^2(e_i)]/N^4,$$

so we define

$$s^2[s^2(\bar{e})] = \sum_{i=1}^{N} s^2[s^2(e_i)]/N^4,$$

which is the minimum-variance unbiased estimator of
$\sigma^2[s^2(\bar{e})]$.

If we are interested in the variance of the production among
factories

$$\sigma^2(p) = \sum_{i=1}^{N} (p_i - \bar{p})^2/(N-1),$$

then

$$\sigma^2(p) = E\{s^2(e)\} - N\,E\{s^2(\bar{e})\},$$

where

$$s^2(e) = \sum_{i=1}^{N} (e_i - \bar{e})^2/(N-1).$$

Hence,

$$s^2(p) = s^2(e) - N s^2(\bar{e})$$

is the minimum-variance unbiased estimator of $\sigma^2(p)$.

Consider the intuitively reasonable estimate r of production p
obtained by adding the average gap in the sample of n serial numbers to
the largest serial number, where a gap is defined as the number of
missing integers between two successive serial numbers obtained. Since
the average gap in the sample is simply $d/(n-1) - 1$, where d is the
difference between the largest and smallest serial numbers in the sample,

$$r = g + d/(n-1) - 1.$$ ¹

It can be seen that the joint distribution of g and d is

$$\Pr\{g, d \mid n, p\} = n(n-1)(d-1)^{(n-2)}/p^{(n)},$$

for $d = n-1, n, n+1, \ldots, g-1$, $g = d+1, d+2, \ldots, p$, and that
the distribution of d is

$$\Pr\{d \mid n, p\} = n(n-1)(d-1)^{(n-2)}(p-d)/p^{(n)}.$$

We also have that

$$E\{d\} = (p+1)(n-1)/(n+1)$$

and, hence,

$$E\{r\} = p.$$

The variance of this estimator is

$$\sigma^2(r) = n(p-n)(p+1)/[(n-1)(n+1)(n+2)]$$

and, therefore, the efficiency of r is

$$\sigma^2(e)/\sigma^2(r) = 1 - n^{-2}.$$

The minimum-variance unbiased estimator of $\sigma^2(r)$ is

$$s^2(r) = s^2(e)/(1 - n^{-2}).$$

Hence, the minimum-variance unbiased estimator of

$$\sigma^2[s^2(r)] = \sigma^2[s^2(e)]/(1 - n^{-2})^2$$

is

$$s^2[s^2(r)] = s^2[s^2(e)]/(1 - n^{-2})^2.$$

The results discussed in the previous analysis of the production of
several factories can be extended now to the case where $r_i$ is used
instead of $e_i$.

The general case where the initial serial number is a known value
s+1 can be reduced to the previous analysis by subtracting s from all
serial numbers.


3. INITIAL NUMBER UNKNOWN 


Now consider a random sample of n integers drawn from the integers
$s+1, s+2, s+3, \ldots, s+p$. The problem is to estimate p when s and p
are unknown.


1 If the definition of “gap” is modified to include also the number of missing integers before the
smallest serial number obtained (another gap, between the smallest value and zero, is added to the
n-1 already considered), then the statistic obtained by adding the average “gap” to the largest serial
number is e; i.e., if “gap” is appropriately defined, Ruggles and Brodie have obtained minimum-variance
unbiased estimates of production.


In the case when the initial number is known,

$$E\{d\} = (p+1)(n-1)/(n+1),$$

where d is the difference between the largest and smallest serial
numbers obtained. This result still holds when the initial number is
unknown; hence,

$$f = d(n+1)/(n-1) - 1$$

is an unbiased estimator of p. For this case, f is the minimum-variance
unbiased estimator of p. Also, the variance of f is

$$\sigma^2(f) = 2(p-n)(p+1)/[(n-1)(n+2)],$$

and if we let

$$s^2(f) = 2(d-n+1)\,d\,(n+1)/[n(n-1)^2],$$

then $s^2(f)$ is the minimum-variance unbiased estimator of $\sigma^2(f)$.
The discussion of estimating $\bar{p}$ and $\sigma^2(p)$ when the initial
number is known may now be repeated, with e replaced by f.
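
Because f depends on the sample only through the range d, its behaviour does not change with the unknown starting point s. The following sketch checks this by enumeration; the function names and the values s = 100, p = 8, n = 3 are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import combinations


def f_estimate(sample):
    """f = d(n+1)/(n-1) - 1, with d the range of the sample (needs n >= 2)."""
    n = len(sample)
    d = max(sample) - min(sample)
    return Fraction(d * (n + 1), n - 1) - 1


def f_variance_estimate(sample):
    """s^2(f) = 2(d-n+1)d(n+1) / [n(n-1)^2]."""
    n = len(sample)
    d = max(sample) - min(sample)
    return Fraction(2 * (d - n + 1) * d * (n + 1), n * (n - 1) ** 2)


def check_f(s, p, n):
    """Mean of f, variance of f, and mean of s^2(f) over all samples."""
    samples = list(combinations(range(s + 1, s + p + 1), n))
    f_vals = [f_estimate(x) for x in samples]
    mean_f = sum(f_vals) / len(samples)
    var_f = sum((v - mean_f) ** 2 for v in f_vals) / len(samples)
    mean_s2 = sum(f_variance_estimate(x) for x in samples) / len(samples)
    return mean_f, var_f, mean_s2


print(check_f(s=100, p=8, n=3))  # (Fraction(8, 1), Fraction(9, 1), Fraction(9, 1))
```

Here 2(p-n)(p+1)/[(n-1)(n+2)] = 2·5·9/10 = 9, so both the variance of f and the average of s²(f) match the formulas above.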


4. SOME PROOFS 


We shall indicate how the formulas presented in the preceding sec- 
tions were obtained by giving a proof that the estimator e is the mini- 
mum-variance unbiased estimator of p when the initial number is 
known, and by then computing the variance of e. All the other results 
presented may be proved by the same methods. 

The following relation A may be used repeatedly to obtain the
formulas presented in the preceding sections:

$$\sum_{i=m}^{t} (i+k)^{(m+k)} = \frac{(t+k+1)^{(m+k+1)}}{m+k+1}.$$

This relation may be proved by mathematical induction on t.

In our notation, the number of ways of selecting a sample of n
integers from the first p integers is $p^{(n)}/n!$. Similarly, the number
of ways of selecting n-1 integers from the first g-1 integers is
$(g-1)^{(n-1)}/(n-1)!$. Hence, the probability that g will be the largest
integer in a sample of size n is given by the equation

$$\Pr\{g \mid n, p\} = \frac{(g-1)^{(n-1)}/(n-1)!}{p^{(n)}/n!} = \frac{n\,(g-1)^{(n-1)}}{p^{(n)}}.$$

We have that g is a sufficient statistic for p. Using Blackwell's theorem
[6], we see that given any unbiased estimate t, one can obtain an
unbiased sufficient estimate (a function of g) whose variance is at least
as small as that of t. Hence, we need concern ourselves only with
estimates which are functions of g.

We shall now show that there is only one function e(g) which is an
unbiased sufficient estimate. For a given sample size n, the possible
values of the parameter are $p = n, n+1, n+2, \ldots$. Also, the possible
values of the greatest number obtained in the sample are
$g = n, n+1, n+2, \ldots$. If p = n, then g = n with probability 1.
Hence, in order that the statistic e(g), which is based only on g, be an
unbiased estimator of p, it is necessary that e(g) = n when g = n. That
is,

$$E\{e(g) \mid p = n\} = e(n) = n.$$

Now if p = n+1, we have that g = n and g = n+1 are the only results
which have a positive probability of occurrence. That is, if e(g) is to
be unbiased for all values of p,

$$\begin{aligned}
E\{e(g) \mid p = n+1\} &= e(n)\Pr\{g = n \mid p = n+1\} + e(n+1)\Pr\{g = n+1 \mid p = n+1\} \\
&= n\Pr\{g = n \mid p = n+1\} + e(n+1)\Pr\{g = n+1 \mid p = n+1\} \\
&= n+1.
\end{aligned}$$

Since the values of the probabilities may be computed, this equation
then may be rewritten as

$$e(n+1) = [\,n+1 - n\Pr\{g = n \mid p = n+1\}\,]/\Pr\{g = n+1 \mid p = n+1\},$$

which specifies the value that e(n+1) must equal in order that e(g)
be unbiased. Similarly, we find when p = n+2,

$$e(n+2) = [\,n+2 - n\Pr\{g = n \mid p = n+2\} - e(n+1)\Pr\{g = n+1 \mid p = n+2\}\,]/\Pr\{g = n+2 \mid p = n+2\},$$

and since all the numbers on the right side of the equation may be
computed, e(n+2) may be computed, and hence is unique. More generally,
we see that if e(g) is unbiased,

$$e(h) = \Big[\,h - \sum_{i=n}^{h-1} e(i)\Pr\{g = i \mid p = h\}\Big] \Big/ \Pr\{g = h \mid p = h\}$$

for $h = n, n+1, \ldots$. The recursion relation for the values of e(h)
determines e(h) uniquely, and hence e(g) is the only unbiased estimate of
p which is based on g only.
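
The recursion can be carried out mechanically, which makes the uniqueness argument concrete. A minimal sketch follows, with hypothetical function names; the probability that the largest of n integers drawn from 1 to h equals i is written with binomial coefficients, which is equivalent to the falling-factorial form used in the text.

```python
from fractions import Fraction
from math import comb


def pr_g(i, h, n):
    """Pr{g = i | p = h}: the largest of n integers drawn from 1..h equals i."""
    return Fraction(comb(i - 1, n - 1), comb(h, n))


def solve_unbiased(n, h_max):
    """Solve the unbiasedness recursion for e(n), e(n+1), ..., e(h_max)."""
    e = {}
    for h in range(n, h_max + 1):
        # Contribution of the values already determined, e(n), ..., e(h-1).
        partial = sum(e[i] * pr_g(i, h, n) for i in range(n, h))
        e[h] = (h - partial) / pr_g(h, h, n)
    return e


n = 3
solution = solve_unbiased(n, 12)
print(all(solution[h] == Fraction(h * (n + 1), n) - 1 for h in solution))
```

Each step has exactly one solution, and the values produced coincide with g(n+1)/n - 1, the estimator e of Section 2.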

We have, using relation A,

$$E\{g \mid n, p\} = \sum_{g=n}^{p} g\,n\,(g-1)^{(n-1)}/p^{(n)} = \sum_{g=n}^{p} n\,g^{(n)}/p^{(n)} = n\,(p+1)^{(n+1)}/[(n+1)\,p^{(n)}] = (p+1)n/(n+1).$$

Hence, the expected value of

$$e = g(n+1)/n - 1$$

is equal to p. Since we have only one unbiased estimate e(g) based on g,
we see that e must equal e(g) and that this must be the estimator whose
variance is at least as small as that of any other unbiased estimator
(which may possibly be based on the entire sample
$a_1, a_2, \ldots, a_n$).²

Now let us compute the variance of g. We have

$$\begin{aligned}
E\{g^2 \mid n, p\} &= \sum_{g=n}^{p} g^2\,n\,(g-1)^{(n-1)}/p^{(n)} = \sum_{g=n}^{p} [(g+1)^{(n+1)} - g^{(n)}]\,n/p^{(n)} \\
&= [(p+2)^{(n+2)}/(n+2) - (p+1)^{(n+1)}/(n+1)]\,n/p^{(n)} \\
&= [(p+2)/(n+2) - 1/(n+1)]\,(p+1)\,n.
\end{aligned}$$

Hence,

$$\begin{aligned}
\sigma^2(g) &= E\{g^2 \mid n, p\} - [E\{g \mid n, p\}]^2 \\
&= [(p+1)n/(n+1)]\,[(p+2)(n+1)/(n+2) - 1 - (p+1)n/(n+1)] \\
&= [(p+1)n/(n+1)]\,[(p-n)/((n+1)(n+2))]
\end{aligned}$$

and, therefore,

$$\sigma^2(e) = (p+1)(p-n)/[(n+2)n].$$
5. A FURTHER ILLUSTRATION: “CARS IN A TOWN” 


R. C. Geary [2] states: “At a recent meeting of the Dublin University
Mathematical Society, E. Schrödinger suggested the following ingenious
problem as an illustration of Pitman’s (see [8]) concept of ‘closeness.’
In a town, cars are known to be numbered consecutively from 1. The
numbers on r of the cars are noted: the problem is to find the closest
estimate of the number of cars in the town. . . . Let n, the unknown
total number, be assumed so large that variation is continuous, i.e.,
that any car number observed at random has a rectangular frequency
distribution. . . .”

Having given two estimates X and Y of an unknown parameter $\theta$,
E. J. G. Pitman [8] has suggested that X should be regarded as a
“closer” estimate of $\theta$ if the probability that
$|X - \theta| < |Y - \theta|$ is greater than 1/2. Geary proceeds to show
that the statistic $b = 2^{1/r} g$ is the “closest” estimate of n. For a
broad class of problems to which this one belongs, the median of the
“closest estimator” is equal to the parameter. Using the notation of the
preceding sections, the “closest” estimate of p is

$$b = 2^{1/n} g.$$


2 This result may also be obtained using the notion of a complete sufficient statistic [7].

First consider the model suggested by Geary where variation is
continuous (a continuous uniform distribution). Here, we have that the
minimum-variance unbiased estimator of p is

$$c = g(n+1)/n.$$

When b differs from c, we find that b underestimates on the average for
any sample size n.

We have as the values of the mean squared error of the statistics b
and c

$$\mathrm{MSE}(b) = E\{(b-p)^2\} = p^2\{[1 - 2^{1/n}n/(n+1)]^2 + 4^{1/n}n/[(n+2)(n+1)^2]\}$$

and

$$\mathrm{MSE}(c) = E\{(c-p)^2\} = p^2/[n(n+2)],$$

respectively. Writing

$$E = \frac{\mathrm{MSE}(c)}{\mathrm{MSE}(b)} = \frac{1}{n(n+2)\{[1 - 2^{1/n}n/(n+1)]^2 + 4^{1/n}n/[(n+2)(n+1)^2]\}},$$

we may, for any particular value of n, decide which statistic, b or c,
should be used, by computing the value of E. As n becomes large, we
find that E approaches

$$E_\infty = 1/\{[1 - \log 2]^2 + 1\} = 0.9.$$

Hence for large samples the “closest” estimate is not quite as efficient 
as the minimum-variance unbiased estimator. 

A more detailed comparison of estimators for a continuous uniform 
distribution appears in [9]. 

Now consider our model where the initial number is known, rather
than the case where variation is continuous. The analogue of the
statistic b is then that value of d for which


= dm, 


We shall consider b as a first approximation, and compare b with 
e=g(n+1)/n—1. 


SERIAL NUMBER ANALYSIS 631 
When the initial number is known, the mean squared errors for b and
e are

$$\mathrm{MSE}(b) = \left[p - \frac{2^{1/n}(p+1)\,n}{n+1}\right]^2 + \frac{4^{1/n}(p+1)(p-n)\,n}{(n+2)(n+1)^2}$$

and

$$\mathrm{MSE}(e) = \frac{(p+1)(p-n)}{n(n+2)},$$

respectively.


For a fixed value of n, MSE(b) and MSE(e) are quadratic functions 
of p. Hence also D=MSE(b) —MSE(e) is a quadratic function of p. 
By computing the roots of D we are able to find the values of p for 
which e is a more efficient estimator than b, for a particular sample 


size n. 
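
As a hedged illustration, the two mean squared errors can be checked by direct enumeration over the distribution of the greatest serial number g, which is Pr{g} = C(g−1, n−1)/C(p, n) when n numbers are drawn without replacement from 1 to p. The helper names below (`max_pmf`, `mse_by_enumeration`, and so on) are assumptions made for this sketch.

```python
from math import comb

def max_pmf(p, n):
    """Pr{g} for the greatest of n serial numbers drawn without replacement from 1..p."""
    return {g: comb(g - 1, n - 1) / comb(p, n) for g in range(n, p + 1)}

def b(g, n):
    return 2 ** (1 / n) * g          # "closest" estimate, used as a first approximation

def e(g, n):
    return g * (n + 1) / n - 1       # estimate e = g(n+1)/n - 1

def mse_by_enumeration(estimator, p, n):
    return sum(prob * (estimator(g, n) - p) ** 2 for g, prob in max_pmf(p, n).items())

def mse_b_formula(p, n):
    r = 2 ** (1 / n)
    return (p - r * (p + 1) * n / (n + 1)) ** 2 + r * r * (p + 1) * (p - n) * n / ((n + 2) * (n + 1) ** 2)

def mse_e_formula(p, n):
    return (p + 1) * (p - n) / (n * (n + 2))

def D(p, n):
    """Difference MSE(b) - MSE(e); positive values favour e."""
    return mse_b_formula(p, n) - mse_e_formula(p, n)

for p, n in ((9, 3), (16, 4)):
    print(p, n, round(mse_b_formula(p, n), 3), round(mse_e_formula(p, n), 3), round(D(p, n), 3))
```

With p = 9, n = 3 and p = 16, n = 4 the formula for MSE(e) gives 4.0 and 8.5, the cases used in the numerical example of Section 6.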


Letting p=kn(k> 1), we find that as n becomes large, D approaches 
D.. = k?(1 — log 2). 


Hence when our model with the initial number known is used for 
large samples, the statistic b is not quite as efficient as the minimum- 
variance unbiased estimator. 


6. A TEST FOR RANDOMNESS AND A NUMERICAL EXAMPLE*


The results presented here might be used to devise some tests for
randomness. For example, we might test whether the greatest integer
among the first three numbers in a random permutation of the integers
1 to 9, obtained from a table of random permutations, has the following
distribution:

Pr{g} = (g − 1)(g − 2)/168.


Samples of forty permutations were obtained from a table of random
permutations of the integers 1 to 9 [3, Table 15.6]. The first three
numbers in a random permutation of the first nine integers may be
considered a random sample of three serial numbers from a population
for which p = 9. We may perform a χ² test on the distribution of the
greatest integer among the three. Table 1 contains the distribution of
the greatest integer among the first three numbers for a sample of
forty permutations drawn at random from the table.

The value of χ² obtained for Table 1 is 1.79, which has a probability
value of P = .77 (4 degrees of freedom); i.e., P = Pr{χ² > 1.79} = .77. A
similar test was performed on the first forty permutations appearing in
the table and again on the second forty permutations appearing in
the table [3, p. 422]. The values of χ² with their corresponding P
values for the samples of forty permutations are presented in Table 2.


* The author is indebted to John Gilbert and Thomas King who performed the computations
presented in this section.

TABLE 1

DISTRIBUTION OF THE GREATEST INTEGER AMONG
THE FIRST THREE NUMBERS IN 40 RANDOM PER-
MUTATIONS OF THE INTEGERS 1 TO 9

Greatest Integer    3-5     6       7       8       9

Theoretical         4.8     4.8     7.1     10.0    13.3

Actual              3       6       6       9       16
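
The theoretical frequencies and the χ² value quoted for Table 1 can be reproduced from Pr{g} = (g − 1)(g − 2)/168. The sketch below is illustrative only: the function names are assumptions, and the tail probability uses the closed form that holds for an even number of degrees of freedom, so no statistical library is required.

```python
from math import comb, exp, factorial

def max_pmf(p, n):
    """Pr{g} for the greatest of n numbers drawn without replacement from 1..p."""
    return {g: comb(g - 1, n - 1) / comb(p, n) for g in range(n, p + 1)}

def chi_square_sf_even(x, df):
    """Upper-tail probability of chi-square for an even number of degrees of freedom."""
    half = x / 2
    return exp(-half) * sum(half ** j / factorial(j) for j in range(df // 2))

pmf = max_pmf(9, 3)
groups = [(3, 4, 5), (6,), (7,), (8,), (9,)]   # classes used in Table 1
observed = [3, 6, 6, 9, 16]                    # "Actual" row of Table 1
expected = [40 * sum(pmf[g] for g in grp) for grp in groups]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi_square_sf_even(chi2, df=len(groups) - 1)

print([round(e, 1) for e in expected])   # theoretical frequencies
print(round(chi2, 2), round(p_value, 2))
```

The five theoretical frequencies sum to 40, and the statistic and probability value agree with the 1.79 and .77 reported in the text.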
TABLE 2

VALUES OF χ² AND THEIR PROBABILITY VALUES P FOR THE
DISTRIBUTION OF THE GREATEST INTEGER AMONG THE
FIRST THREE NUMBERS IN SAMPLES OF 40 PER-
MUTATIONS OF THE INTEGERS 1 TO 9

                      Sample Drawn    Sample of      Sample of
                      at Random       First Forty    Second Forty
χ²                    1.79            2.59           3.36
Probability values    .77             .61            .50


Table 3 contains the distribution of the greatest integer among 
the first four numbers in forty permutations of the integers 1 to 16 
which were drawn at random from the table [3, Table 15.7]. The first 
four integers may be considered a sample of four serial numbers from 


TABLE 3

DISTRIBUTION OF THE GREATEST INTEGER AMONG THE
FIRST FOUR NUMBERS IN 40 RANDOM PERMUTA-
TIONS OF THE INTEGERS 1 TO 16

Greatest Integer    4-10    11-12   13      14      15      16
Theoretical         4.6     6.3     4.8     6.3     8.0     10.0
Actual Fj 3 6 7 11 


a population where p = 16. The value of χ² (5 degrees of freedom)
obtained for Table 3 is 3.45, which has a probability value of .62. A
similar test was performed on the first forty permutations appearing
in the table and again on the second forty in the table [3, p. 428].
The values of χ² with their corresponding probability values for the
samples of forty permutations are presented in Table 4.


TABLE 4

VALUES OF χ² AND THEIR PROBABILITY VALUES P FOR THE
DISTRIBUTION OF THE GREATEST INTEGER AMONG THE
FIRST FOUR NUMBERS IN SAMPLES OF 40 PERMUTA-
TIONS OF THE INTEGERS 1 TO 16

                      Sample Drawn    Sample of      Sample of
                      at Random       First Forty    Second Forty
χ²                    3.45            11.43          7.85
Probability values    .62             .04            .16

We find that the sum of the values of χ² for the six samples (Tables
2 and 4) is χ² = 30.47 (4×3 + 5×3 = 27 degrees of freedom), which has
a probability value of .29. The results of this analysis do not disagree
with results of the tests of randomness applied by Cochran and Cox
[3, 15.4].


The work presented in the preceding paragraphs could have been
considered an empirical check of the mathematically derived results
as well as a test of the adequacy of a particular table as a source of
random numbers. The samples could also be used to illustrate the
methods presented in the preceding sections. In Tables 5 and 6 we
have presented the analysis of two sets of forty samples of three and
four serial numbers, respectively. Note the relation between the mean
squared errors of e and of r in both samples.

TABLE 5

ANALYSIS OF 40 SAMPLES OF 3 SERIAL NUMBERS
FROM THE FIRST 9 INTEGERS

                                   Expected     Average      Mean Squared
Estimate                           Value of     of 40        Error of 40
                                   Estimate     Estimates    Estimates
e                                  9.0          9.3          3.2
r                                  9.0          9.2          3.5
s²(e), variance of e               4.0          4.2          3.0
s²(r), variance of r               4.5          4.8          3.8
s²(s²(e)), variance of s²(e)       3.2          3.5          5.0
s²(s²(r)), variance of s²(r)       4.1          4.5          8.1

TABLE 6

ANALYSIS OF 40 SAMPLES OF 4 SERIAL NUMBERS
FROM THE FIRST 16 INTEGERS

                                   Expected     Average      Mean Squared
Estimate                           Value of     of 40        Error of 40
                                   Estimate     Estimates    Estimates
e                                  16.0         15.9         9.1
r                                  16.0         15.9         9.8
s²(e), variance of e               8.5          8.4          10.2
s²(r), variance of r               9.3          9.0          11.6
s²(s²(e)), variance of s²(e)       9.1          9.3          30.8
s²(s²(r)), variance of s²(r)       10.5         10.6         39.9
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CONFIDENCE INTERVALS FOR MEDIANS AND 
OTHER POSITION MEASURES* 


Ralph S. Woodruff
Bureau of the Census 


THE common text-book measures of reliability of medians, quartiles,
deciles, and other position measures depend on the assumption
that the distribution to which they are applied is of a particular type
(such as normal). This is usually not the case and a considerable error
can result from this incorrect assumption. This article describes a
method of obtaining confidence intervals for medians and other
position measures using a principle that has been applied to simple random
sampling and extending it to any type of sampling.¹ This method does
not depend on the assumption that the distribution is normal or of any
other special type. All proofs and illustrations will be in terms of
medians but the principles apply equally to any other position measure.


I. METHOD OF ESTIMATING CONFIDENCE LIMITS FOR MEDIANS 


Let us assume that a sample median is determined by the following
process: A sample is drawn from the population and a reflection of the
population (hereafter referred to as the pseudo-population) is obtained
by weighting each sample item by its proper (not necessarily equal)
weight. This pseudo-population is arrayed in order of size for some
characteristic (x) and the middle item, or the average of the two middle
items, is selected as the sample median (θ′).

Confidence limits for the median are obtained in the following steps:
(1) The standard deviation of the percentage of items in the pseudo-
population which are less than θ (the true median) is estimated. In
practice, since θ is not known, the standard deviation of the percentage
of items less than θ′ (the sample median) is substituted.² (2) The value
determined in step (1) is added to and subtracted from 50%. The values
of x corresponding to the 50 ± σ percentage points are then read off the
pseudo-population arrayed in order of size. The resulting values
constitute one standard deviation confidence limits for the median. 2, 3,
or K standard deviation confidence limits are calculated in a comparable
fashion. (See Exhibit I for an actual example of the computation
of median confidence intervals and Exhibit IV for a brief outline of the
method of applying the principles to other quantiles.)

* The method of calculating confidence limits for medians described in this paper has been used in
the Bureau of the Census for a number of years, having been used first in the survey of radio listening
habits conducted by the Bureau for the Federal Communications Commission in 1945. The author,
aided by suggestions from other members of the sampling staff of the Bureau of the Census, has
developed the rationale for the method.

1 It is required only that the sample be a probability sample, i.e., that the probability of each item
in the population coming into the sample be known. Note that the sampling procedures can involve
any combination of such devices as stratification (with or without proportional allocation), cluster
sampling with clusters of equal or unequal sizes, multi-stage sampling, etc.

2 The problem of estimating the variance of the percentage of items less than a stated value is a
special case of the problem of estimating variances of means. This problem has been adequately treated
in the literature and it is not the purpose of this paper to discuss it. Rather the paper shows how to
translate estimates of variance of this particular type of mean into confidence limits for the median.
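
The two steps described above can be sketched in code as follows. This is a minimal illustration, not the Bureau's procedure: it assumes the standard deviation `sigma` of the proportion below the median has already been estimated by a formula suited to the design, it reads values off the weighted step function without interpolation, and the names `weighted_quantile` and `position_limits` are hypothetical.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative share of the total weight reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running / total >= q:
            return value
    return pairs[-1][0]

def position_limits(values, weights, sigma, k=1.0, center=0.5):
    """Read off the pseudo-population at center - k*sigma and center + k*sigma.

    sigma is the estimated standard deviation of the proportion of the
    pseudo-population below the position measure (center=0.5 for the median).
    """
    lower_q = max(center - k * sigma, 0.0)
    upper_q = min(center + k * sigma, 1.0)
    return (weighted_quantile(values, weights, lower_q),
            weighted_quantile(values, weights, upper_q))

# Example: 100 equally weighted items with an assumed sigma of 0.048.
values = list(range(1, 101))
weights = [1.0] * 100
print(position_limits(values, weights, sigma=0.048))        # one-sigma limits
print(position_limits(values, weights, sigma=0.048, k=2))   # two-sigma limits
```

Setting `center=0.25` gives the corresponding limits for the first quartile, in the manner outlined in Exhibit IV.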


II. THEORETICAL JUSTIFICATION FOR METHOD OF CALCULATING
CONFIDENCE INTERVALS FOR MEDIANS 


The basic principle behind the suggested approach can best be
explained graphically. Let us assume that a given population is arrayed
by size. The point θ (the true median) divides the population into two
equal parts.³ Let us assume further that a sample is drawn from this
population and by proper weighting of each sample item a reflection
of the original population is obtained and ordered by size. This can be
illustrated graphically by the following cumulative frequency curves.

[Figure 1. Cumulative percentage frequency (vertical axis, 0 to 100) plotted against value of item x (horizontal axis), with θ marked on the horizontal axis.]

3 Note that this condition is not met where more than a negligible proportion of the population has
a value exactly equal to θ. Where there is a possibility that an appreciable proportion in the population
possesses a value exactly equal to θ, interpolation of the type described in Section III-D will produce a
derived frequency distribution where the probability of values exactly equal to θ is negligible.

In Figure 1 the solid line is the cumulative frequency curve of the
population while the dashed line represents the cumulative frequency
curve derived from the pseudo-population.

Now let us draw two arbitrary horizontal lines across the graph:
one lower than the 50 per cent point of the Y-axis (we shall call this
A) and one higher than the 50 per cent point of this axis (we shall
call this B). Let x_A and x_B be the Ath and Bth items of the pseudo-
population (graphically the x-values of the intersections of the
cumulative frequency curve of the pseudo-population with A and B). Let
p be the percentage of items in the pseudo-population less than the
true median (graphically the Y-value of the intersection of the
cumulative frequency curve of the pseudo-population with θ). Then, since
these curves are non-decreasing, θ is within the limits x_A and x_B if
and only if p is within the limits A and B.

Symbolically this can be expressed as

Pr(x_A < θ < x_B) = Pr(A < p < B).⁴ ⁵


It is immediately apparent that this relationship greatly simplifies
the problem of obtaining confidence limits for the median. It is very
difficult to determine either exact or approximate probabilities directly
for the type of interval shown on the left-hand side of the above
equality. However, it is relatively easy to make statements about
probabilities for intervals such as those on the right-hand side of the
equation. In general, we can estimate the variance of the percentage
of sample items less than the true median (σ_p²). If we then choose our
arbitrary limits A and B so that A = 50% − Kσ_p and B = 50% + Kσ_p
(K being any positive number), we can make the same statements
about the probability of including the true median in the derived
median confidence limits x_A and x_B as can be made about the
probability of p falling within the limits of A and B. If for example the
distribution of p is near enough normal that we can state that p will
fall within the two standard deviation limits A and B approximately
95 per cent of the time, then we can state that the derived confidence
limits x_A and x_B will include the true median approximately 95
per cent of the time.⁶

4 It is to be noted that the cumulative frequency curve derived from the pseudo-population will
not be a continuous curve but will have discrete jumps if interpolation is not used. As for the continuous
curve, there will be unique values x_A and x_B corresponding to the limits A and B except in the unlikely
event that a horizontal section of the step diagram coincides exactly with A or B. In this case, the highest
valued x_A and lowest valued x_B should be taken if Pr(x_A < θ < x_B) is to equal Pr(A < p < B).

5 This principle as applied to simple random sampling has been previously stated in the literature.
For example, see “On confidence ranges for the median and other expectation distributions for
populations of unknown distribution form” by W. R. Thompson, Annals of Mathematical Statistics, 7 (1936),
122-28; “Order statistics” by S. S. Wilks, Bulletin of the American Mathematical Society, 54 (1948), 14;
and A. M. Mood, Introduction to the Theory of Statistics (McGraw-Hill, 1950), 388-89.
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III. ESTIMATING THE VARIANCE OF THE PERCENTAGE OF SAMPLE 
ITEMS LESS THAN THE TRUE MEDIAN 


It is apparent from the above relationships that the derived
confidence intervals for the median depend on the estimate of the variance
of the percentage of items in the pseudo-population less than the true
median (σ_p²). As has been pointed out, σ_p² is in reality the variance
of a particular type of mean and the general problem of estimates of
variance for means will not be discussed in this paper. However,
there are problems in estimating σ_p² which are peculiar to its use in
this case. These are discussed below.


A. Estimation of σ_p² where simple random sampling is used.


The problem is simplest when the n items selected in the sample
are chosen with a simple random sampling process. A solution to the
problem in this case has already been suggested.⁷ If the sampling is
done with replacement or from a very large population the variance
of the sample number less than θ is nPQ where P = Q = .5.⁸ The one
standard deviation limits are (n/2) ± √n/2. To secure the corresponding
one standard deviation confidence limits for the median it is
necessary only to find the values of the (n + √n)/2 and the (n − √n)/2
items of the sample (arrayed in order of size).

If simple random sampling is done without replacement from a
finite population of N items, the variance of the number less than θ is
[(N−n)/(N−1)] nPQ where P = Q = .5. These results, of course, are
known and are exact even though only sample results are available.
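
For simple random sampling the limits are just two order statistics, which the following sketch picks out. The rounding of fractional ranks outward (lower rank down, upper rank up) is an assumption of this illustration, as is the function name `srs_median_limits`; the paper itself does not specify a rounding rule.

```python
from math import ceil, floor, sqrt

def srs_median_limits(sample, k=1.0, population_size=None):
    """K standard deviation limits for the median under simple random sampling."""
    ordered = sorted(sample)
    n = len(ordered)
    variance = n * 0.25                       # n P Q with P = Q = .5
    if population_size is not None:           # sampling without replacement
        variance *= (population_size - n) / (population_size - 1)
    half_width = k * sqrt(variance)
    # Fractional ranks are rounded outward (an assumption of this sketch).
    lower_rank = max(1, floor(n / 2 - half_width))
    upper_rank = min(n, ceil(n / 2 + half_width))
    return ordered[lower_rank - 1], ordered[upper_rank - 1]

sample = range(1, 101)                        # n = 100, so sqrt(n)/2 = 5
print(srs_median_limits(sample))              # 45th and 55th items
print(srs_median_limits(sample, k=2))         # 40th and 60th items
```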

If a more complex design than simple random sampling is used and
it is known that the design is less efficient than simple random
sampling, then the n/4 approximation will give a quick lower bound to the
variance σ_p² and the corresponding median confidence intervals.
Similarly if the design is more efficient than simple random sampling the
n/4 approximation will give a quick upper bound to the variance σ_p².

6 It is to be noted that in many practical sampling problems p will be based on a large sample.
Under these circumstances its distribution may be near normal even though the distribution of
individual items in the population is far from normal.

7 See references previously cited.

8 In this case a more exact estimate of Pr(A < p < B) could be obtained if desired by summing
terms of the binomial expansion (either directly or from tables of incomplete beta function). Using this
approach, K. R. Nair has computed the symmetric limits A and B for which Pr(x_A < θ < x_B) ≥ .95 and
also ≥ .99 for n = 6, 7, 8, ..., 81. (“Table of confidence intervals for the medians in samples from any
continuous population,” Sankhyā, 4 (1940), pp. 543-50).


B. Substitution of a sample estimate of the median (θ′) for the true
median (θ) in the calculation of σ_p².


When systems of sampling other than simple random sampling are
used, it is not possible to obtain exact estimates of the variance of the
number of sample items less than θ from sample data. However,
approximations of this variance can be obtained from sample data. For
example, if stratified sampling is used and n_i of N_i items are taken at
random from each of R strata, an unbiased estimate of the desired
variance would be


$$\sum_{i=1}^{R} \frac{(N_i - n_i)\,n_i^2}{N_i\,(n_i - 1)}\; p_i (1 - p_i)$$


where p_i is the proportion of sample items in the ith stratum less than
θ. Of course if only a sample is available, θ is not known and p_i cannot
be determined. However, we can determine the proportion of sample
items in the ith stratum less than θ′ (the sample median) and use this
value (which we shall designate as p_i′) in the above equation. It is
difficult to determine the expected value of such an estimate of the
variance. However, it is evident that other things being equal, the
larger the number of strata and the larger the number of items taken
from each stratum, the more stable the estimate θ′ becomes and the
less effect the substituting of θ′ for θ in the variance formula will have.
In this connection some empirical computations have been made with
a theoretical population (Exhibit II). This population consists of 60
items ranked from 1 to 60. It is similar to many populations found in
practice in that there is a substantial difference between strata but
still a substantial variance within strata. However, the number of
strata and size of sample are smaller than those usually found in
practice. The population variance for a sample of 30 items (six items from
each of 5 strata) has been calculated and compared (1) with the sample
variance assuming θ is known and (2) with the sample variance using
the estimate θ′. In general, by averaging 30 different empirical results
it appeared (1) that the bias encountered using θ′ is small, (2) that the
correlation between the two variance estimates (using θ and θ′) is
large, and (3) that the mean-square-errors of the two variance
estimates are very similar. (As a matter of fact, the mean-square-error
of the estimate using θ′ was slightly smaller.)

A similar test was made with a small but otherwise typical cluster
sample (see Exhibit III). The conclusions are similar except that in this
case the empirical mean-square-error of the variance estimate using θ′
is slightly larger than that using θ.

It would appear then that the variance estimate which employs an
estimate of θ may be as acceptable as one using the true θ, at least under
conditions where a reasonably stable estimate of θ is obtained.


C. Estimation of σ_p² where the number of items in the sample is variable.


In many sampling designs found in practice n′ (the total number of
items in the pseudo-population) is a variable. For example, if the
design employs clusters which are unequal in size in terms of the unit
for which the median is calculated, n′ will be variable. In this case,
σ_p² becomes the variance of the ratio x′/n′ where x′ is the unbiased
estimate of the number of items less than θ and n′ is the unbiased
estimate of the total number of items. Using the usual approximation
to this variance we have σ_p² = (.5)²[V_x² + V_n² − 2V_xn] where V_x²
and V_n² are the coefficients of variation squared for x′ and n′
respectively and V_xn is the covariance between x′ and n′ divided by
the product of x′ and n′.
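
A brief sketch of this approximation follows. The function and argument names are illustrative assumptions; the ratio x′/n′ is used in place of the constant .5, which it equals when x′ is evaluated at the median.

```python
def sigma_p_squared(x_est, n_est, var_x, var_n, cov_xn):
    """Usual approximation to the variance of the ratio x'/n'.

    x_est, n_est  : estimates of the number below the median and of the total
    var_x, var_n  : their estimated variances
    cov_xn        : their estimated covariance
    """
    p = x_est / n_est                      # equals .5 at the median
    v_x2 = var_x / x_est ** 2              # squared coefficient of variation of x'
    v_n2 = var_n / n_est ** 2              # squared coefficient of variation of n'
    v_xn = cov_xn / (x_est * n_est)        # relative covariance
    return p ** 2 * (v_x2 + v_n2 - 2 * v_xn)

# With n' fixed (no variance, no covariance) the result reduces to var_x / n'^2.
print(sigma_p_squared(50, 100, 25, 0, 0))
```

When x′ and n′ move together perfectly (x′ always exactly half of n′), the three terms cancel and the approximation gives zero, as it should.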


D. Estimation of σ_p² where frequency distributions are used.


Another problem arising in practice is that the pseudo-population is
often considered as a frequency distribution instead of being considered
in all its detail. In this case, the median is commonly estimated from
the pseudo-population with the formula

$$\theta' = K + \frac{N/2 - c'}{v'}\; i$$

where

K = lower limit of median class,
N = number of items in pseudo-population (can be a constant or
variable as pointed out),
c′ = total frequency in the pseudo-population up to median class,
v′ = number of items in the pseudo-population in median class,
i = class interval.


If we assume that the “true” median is estimated in a similar fashion
the formulation for its calculation would be

$$\theta = K + \frac{N/2 - C}{V}\; i$$

where C and V are the population totals corresponding to the totals
c′ and v′ from the pseudo-population in the above formula. If we represent

$$\frac{N/2 - C}{V}$$

(the amount of interpolation in the median class) by the symbol F,
then p would be p₁ + Fp₂ where p₁ is the percentage of items in the
pseudo-population having a value from 0 to K (up to the median
class) and p₂ is the percentage of items in the pseudo-population
having a value from K to K + i (in the median class). If we let σ₁²
represent the variance of p₁, σ₂² represent the variance of p₂, and σ₁₂
represent the covariance between p₁ and p₂, then σ_p² would equal

$$\sigma_1^2 + F^2\sigma_2^2 + 2F\sigma_{12}.$$
The above variance can be approximated from the sample by
assuming that the median class indicated by the sample is the true median
class, by substituting the sample value of F for the true F and by
estimating σ₁², σ₂², and σ₁₂ from the sample using variance formulas
appropriate for the design used. This should be about as satisfactory
as the ordinary approximation of a variance from a sample provided
that the sample is sufficiently large that the median class and F are
determined with reasonable accuracy.

If the net value of F²σ₂² + 2Fσ₁₂ is small relative to σ₁², then σ₁² can
be substituted for the full variance σ_p². In the attached illustration
(Exhibit I) the variances and the corresponding median confidence
limits were estimated using both σ₁² and σ₁² + F²σ₂² + 2Fσ₁₂ as estimates
of σ_p². As data were published (to the nearest tenth) results were
identical. In general the size of σ₂² and σ₁₂ relative to σ₁² should decrease as
the median class becomes relatively smaller. In other words, the σ₁²
approximation should work best where no one class contains a large
proportion of the total number of items. In the illustration the median
class had a large proportion of the total number of items (26%).

If σ₁² is substituted as an estimate of the variance of the percentage
less than θ, it should be estimated through that class whose upper
limit is nearest 50% (in other words the absolute value of F should be
minimized). For example, consider the following cumulative distribution:

                    Percentage of radio households
No. stations        hearing this number of
                    stations or less

0                   10
1                   20
2                   35
3                   55
4                   70
5                   100

For the above distribution, σ₁², if used alone, should be the variance
of the percentage of radio households hearing three stations or less
instead of the percentage of radio households hearing two stations or
less.


IV. CONCLUSION 


In conclusion, it appears that confidence limits for medians and 
other quantiles can be approximated for any sampling design where 
the variance of the percentage of items less than a stated value can be 
acceptably estimated (in general, where large samples are involved). 


EXHIBIT I 


EXAMPLE OF PROPOSED METHOD OF COMPUTATION OF MEDIAN 
CONFIDENCE INTERVALS 


An example which can be used to illustrate the computation of con- 
fidence limits for the median is furnished by the survey of radio 
listening habits which was conducted in 1945 by the Bureau of the 
Census for the Federal Communications Commission. One of the 
tables published as a result of this survey was a frequency distribution 
of the percentage of households with working radios hearing 0, 1, 2, 
etc., Class IA and IB (clear channel) stations at night. 

The distribution resulting from the pseudo-population was as follows:

                    Percentage of all
No. of stations     radio households     Cumulative
heard               hearing exactly      percentage
                    stated number
0                   14                   14
1                   30                   44
2                   26                   70
3                   19                   89
4                   8                    97
5                   2                    99 (a)

(a) Does not add to 100 because of rounding in individual classes.

The median was calculated from the formula 1.5 + (6/26)(1.0) = 1.7
(using the straight-line interpolation within the median class
previously described).

Since the median was calculated from a frequency distribution using
straight-line interpolation in the median class, the formula
σ₁² + F²σ₂² + 2Fσ₁₂ is applicable for the variance of the percentage of
households hearing fewer than θ stations. Since we do not know the exact
values of σ₁², σ₂², F, and σ₁₂, we must substitute estimates derived from our
sample. In our sample results, the sample median is included in the
2 station class. Therefore σ₁² is the variance of the percentage hearing
1 or fewer stations (since the sample was an area sample, the number
of radio households is also variable, therefore the appropriate variance
is the variance of the ratio x₁′/n′ where x₁′ is the unbiased estimate of
the number of radio households hearing 0 or 1 stations and n′ is the
unbiased estimate of the number of radio households). An estimate of F
is the amount of interpolation in the median class (6%/26%). σ₂² is
the variance of the percentage of radio households hearing exactly 2
stations and σ₁₂ is the covariance between the percentage of radio
households hearing 0 or 1 station and the percentage of radio
households hearing exactly 2 stations. As estimated from the sample these
values were

σ₁² = .0007070,
σ₂² = .0001708,
σ₁₂ = −.0001323,
F = .2308,
σ₁² + F²σ₂² + 2Fσ₁₂ = .0006550,
√(σ₁² + F²σ₂² + 2Fσ₁₂) = .02560.


Since the expected value of the percentage of radio households
hearing θ or fewer stations is 50%, the 1σ limits are 47.44%-52.56%,
the 2σ limits are 44.88%-55.12%, and the 3σ limits are 42.32%-
57.68%.

By reading these values off the pseudo-population in the same fashion
that the median was obtained, we obtain for the 1σ median confidence
limits 1.632 and 1.829, for the 2σ confidence limits 1.534 and 1.928, and
for the 3σ confidence limits 1.444 and 2.026.
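
The arithmetic of this exhibit can be retraced with a short script. It is a sketch under stated assumptions: each class is taken to be one unit wide and centred on the number of stations (so the class for 0 stations is assumed to start at −0.5, extending the 1.5 lower limit used for the median class), and the names `classes`, `read_off` and `limits` are illustrative.

```python
from math import sqrt

# (lower class limit, percentage in class) for 0, 1, 2, 3, 4, 5 stations heard.
classes = [(-0.5, 14), (0.5, 30), (1.5, 26), (2.5, 19), (3.5, 8), (4.5, 2)]
WIDTH = 1.0

def read_off(percent):
    """Value of x at a cumulative percentage, by straight-line interpolation."""
    cumulative = 0.0
    for lower, share in classes:
        if percent <= cumulative + share:
            return lower + (percent - cumulative) / share * WIDTH
        cumulative += share
    raise ValueError("percentage point lies beyond the tabulated classes")

# Sample estimates quoted in the exhibit.
var1, var2, cov12, F = 0.0007070, 0.0001708, -0.0001323, 0.2308
variance = var1 + F ** 2 * var2 + 2 * F * cov12
sigma = 100 * sqrt(variance)            # standard deviation in percentage points

def limits(k):
    """k standard deviation confidence limits for the median."""
    return tuple(round(read_off(50 + sign * k * sigma), 3) for sign in (-1, 1))

print(round(variance, 7), round(read_off(50), 1))
for k in (1, 2, 3):
    print(k, limits(k))
```

The script uses the unrounded standard deviation rather than the published .02560, which does not alter the limits at three decimal places.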

As the median confidence intervals for this survey were
calculated, only the first term (σ₁²) was used as an approximation of
the variance of the percentage less than θ. Data as published (to the
nearest tenth) were identical, as can be seen from the following table.




AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1952 


                                  As calculated      As calculated with
                                  with σ₁²           σ₁² + F²σ₂² + 2Fσ₁₂
Estimated variance of per-
centage less than θ               .0007070           .0006550
Median confidence intervals
  1σ                              1.628-1.833        1.632-1.829
  2σ                              1.526-1.935        1.534-1.928
  3σ                              1.434-2.038        1.444-2.026


EXHIBIT II


EMPIRICAL EVIDENCE OF EFFECT OF SUBSTITUTING AN ESTIMATED
MEDIAN (θ′) FOR THE TRUE MEDIAN (θ) IN ESTIMATES OF THE
VARIANCE OF THE NUMBER OF SAMPLE ITEMS LESS THAN THE
TRUE MEDIAN: FOR A STRATIFIED SAMPLE DESIGN


POPULATION: It is assumed that there are 60 items which are ranked 
from 1 to 60. These are assumed to be arranged in the following 


strata. In the samples 6 items are drawn without replacement from 
each stratum. 


Stratum 1: 1, 3, 5, 7, 8, 9, 10, 
Stratum 2: 2, 4, 6, 11, 12, 16, 
Stratum 3: 17, 18, 22, 25, 26, 28, 34, 36, 
Stratum 4: 14, 20, 23, 27, 30, 33, 35, 45, 49, 
Stratum 5: 19, 24, 37, 38, 39, 44, 47, 54, 55, 57, 58 


3, 15, 46, 48, 
1 


1 
21, 29, 31, 32, 40, 43 


                         Averages from 30 samples

              Variances as calcu-      Mean-square-error of     Correlation
True          lated from sample        sample variances         between two
variance                                                        types of variance
              with θ      with θ′      with θ      with θ′      estimates
(1)           (2)         (3)          (4)         (5)          (6)

3.409         3.383       3.420        .2063       .1804        .86


True variance estimate form:

$$\sum_{i=1}^{R} \frac{n_i (N_i - n_i)}{N_i - 1}\; P_i Q_i$$

Sample variance estimate form where θ is assumed to be known:

$$\sum_{i=1}^{R} \frac{N_i - n_i}{N_i} \cdot \frac{n_i^2}{n_i - 1}\; p_i q_i$$

Sample variance estimate form where θ′ is used:

$$\sum_{i=1}^{R} \frac{N_i - n_i}{N_i} \cdot \frac{n_i^2}{n_i - 1}\; p_i' q_i'$$

In the above formulas there are R strata; N_i and n_i are the total
number of items and the number of sample items from each stratum;
P_i and Q_i are the percentages of items smaller and larger than θ in
each stratum; p_i and q_i are the percentages of sample items smaller
and larger than θ in each stratum; p_i′ and q_i′ are the percentages of
sample items in each stratum smaller and larger than θ′.
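
The stratified forms can be written out as below. This is a sketch that assumes the standard finite-population expressions for the variance of the number of sample items below θ in each stratum, n_i(N_i − n_i)/(N_i − 1) P_i Q_i, and its unbiased sample estimate; the function names are illustrative. The small enumeration at the end checks the unbiasedness on a single stratum of four items.

```python
from itertools import combinations

def true_variance(N, n, P):
    """Variance of the number of sample items below theta (lists, one entry per stratum)."""
    return sum(n_i * (N_i - n_i) / (N_i - 1) * P_i * (1 - P_i)
               for N_i, n_i, P_i in zip(N, n, P))

def sample_variance(N, n, p):
    """Sample estimate of the same variance; p holds sample proportions below theta
    (or below the sample median theta' when theta is unknown)."""
    return sum((N_i - n_i) / N_i * n_i ** 2 / (n_i - 1) * p_i * (1 - p_i)
               for N_i, n_i, p_i in zip(N, n, p))

# One stratum of 4 items, 2 of them below theta (coded 1); all samples of size 2.
stratum = [1, 1, 0, 0]
samples = list(combinations(stratum, 2))
average_estimate = sum(sample_variance([4], [2], [sum(s) / 2]) for s in samples) / len(samples)

print(true_variance([4], [2], [0.5]), average_estimate)
```

Averaged over all six possible samples, the estimate equals the true variance of 1/3.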


EXHIBIT III 


EMPIRICAL EVIDENCE OF EFFECT OF SUBSTITUTING AN ESTIMATED
MEDIAN (θ′) FOR THE TRUE MEDIAN (θ) IN ESTIMATES OF THE
VARIANCE OF THE NUMBER OF SAMPLE ITEMS LESS THAN THE
TRUE MEDIAN: FOR A CLUSTER SAMPLE DESIGN


POPULATION: It is assumed that there are 80 items which are ranked 
from 1 to 80. These are assumed to be arranged in the following 20 
clusters. 10 of the 20 clusters are drawn without replacement in the 
30 samples. 


i 


Cluster Item 
number ranks 


Cluster Item 
number ranks 


Cluster Item 
number ranks 


Cluster Item 
number ranks 


6 18, 19, 22, 37 
7 . 48, 49, 69, 71 
8 . 54, 73, 76, 78 
9 , 50, 32, 59, 72, 74 
10 17, 39, 40, 41 47, 55, 58, 77 


                         Averages from 30 samples

              Variances as esti-       Mean-square-error of     Correlation
True          mated from samples       sample variances         between two
variance                                                        types of variance
              with θ      with θ′      with θ      with θ′      estimates
(1)           (2)         (3)          (4)         (5)          (6)

16.845        16.91       16.37        4.262       4.910        .66


Cluster number    Item ranks
1                 6, 15, 20, 26
2                 23, 24, 31
3                 4, 9, 14, 29
4                 7, 10, 12, 13
5                 1, 3, 28, 36

True variance estimate form:

$$\frac{m(M - m)}{M} \cdot \frac{\sum_{i=1}^{M} (X_i - \bar{X})^2}{M - 1}$$

Sample variance estimate form where θ is assumed to be known:

$$\frac{m(M - m)}{M} \cdot \frac{\sum_{i=1}^{m} (x_i - \bar{x})^2}{m - 1}$$

Sample variance estimate form where θ′ is used:

$$\frac{m(M - m)}{M} \cdot \frac{\sum_{i=1}^{m} (y_i - \bar{y})^2}{m - 1}$$

M and m are the total number of clusters and the number of sample
clusters respectively.
X_i is the number of items in the ith cluster less than θ while x_i is
the number of items in the ith sample cluster less than θ.
y_i is the number of items in the ith sample cluster less than θ′.
X̄ is the average per cluster (2) while

$$\bar{x} = \frac{\sum_{i=1}^{m} x_i}{m} \quad \text{(the sample average per cluster)},$$

$$\bar{y} = \frac{\sum_{i=1}^{m} y_i}{m} \quad \text{(the sample average per cluster evaluated using } \theta') = 2 = \bar{X}.$$
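
The cluster forms can be sketched in the same way. The code assumes the usual expression for the variance of a total from m clusters drawn without replacement out of M, m(M − m)/M times the between-cluster mean square; the function names and the four-cluster example are illustrative and are not the population of this exhibit.

```python
from itertools import combinations

def true_variance(X, m):
    """Variance of the number of sample items below theta; X lists all M cluster counts."""
    M = len(X)
    mean = sum(X) / M
    return m * (M - m) / M * sum((x - mean) ** 2 for x in X) / (M - 1)

def sample_variance(x, M):
    """Estimate from the m sampled cluster counts x (use y_i when theta' replaces theta)."""
    m = len(x)
    mean = sum(x) / m
    return m * (M - m) / M * sum((v - mean) ** 2 for v in x) / (m - 1)

# Hypothetical population of 4 clusters; every possible sample of 2 clusters.
X = [0, 1, 2, 3]
estimates = [sample_variance(s, len(X)) for s in combinations(X, 2)]

print(true_variance(X, 2), sum(estimates) / len(estimates))
```

Averaged over all possible samples, the estimate with θ known reproduces the true variance.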


EXHIBIT IV 
METHOD OF CALCULATING CONFIDENCE LIMITS FOR OTHER QUANTILES 


To illustrate the application of these principles to quantiles
other than the median, let us consider, for example, confidence limits
for the first quartile (λ).

This would proceed in the following steps:

1. The standard deviation of the percentage of items in the
pseudo-population less than λ would be estimated. Actually, the estimate
of the variance of the percentage of items less than λ′ (the quartile
derived from the pseudo-population) would be substituted.

2. The value determined in Step 1 is added to and subtracted from
25%. The values of x corresponding to the 25 ± σ percentage
points are then read off the pseudo-population arrayed in order
of size. The resulting values constitute one standard deviation
confidence limits for the first quartile.
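Step 2 can be sketched in Python. The sketch assumes that the pseudo-population is held as a list of values with sampling weights and that the standard deviation from Step 1 is already available in percentage points. Both function names and the data are hypothetical illustrations, not part of the exhibit.

```python
def read_off_pseudo_population(values, weights, pct):
    """Smallest value at which the cumulative weighted percentage reaches pct."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and of equal length")
    pairs = sorted(zip(values, weights))  # pseudo-population arrayed in order of size
    total = float(sum(weights))
    target = total * pct / 100.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= target - 1e-9:
            return value
    return pairs[-1][0]


def one_sigma_limits(values, weights, sigma_pct, pct=25.0):
    """Values read off at (pct - sigma_pct) and (pct + sigma_pct) percent."""
    lower = read_off_pseudo_population(values, weights, pct - sigma_pct)
    upper = read_off_pseudo_population(values, weights, pct + sigma_pct)
    return lower, upper


# Illustrative data only: 100 equally weighted items valued 1..100,
# with an assumed standard deviation of 5 percentage points.
values = list(range(1, 101))
weights = [1.0] * 100
limits = one_sigma_limits(values, weights, sigma_pct=5.0)
```

With these made-up inputs the limits are the values at the 20 and 30 percentage points of the ordered pseudo-population.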
PROBLEMS IN INTER-SPATIAL COMPARISONS OF
WORKER EFFICIENCY: ILLUSTRATIONS FROM 
PUERTO RICO 


SIMON ROTTENBERG
University of Chicago 


THERE is already a considerable literature on international and
interregional comparison of labor productivity¹ and an occasional
note recording a caveat against the use of the results of such comparisons
without great care.²

These comparisons, like the inter-temporal variety, show physical 
or value outputs per worker or per worker-hour in the compared 
situations. Because labor productivity statistics give a relationship 
between input of labor and output of product, laymen often assume 
that they tell something about comparative worker skill and effort. 
Practiced handlers of productivity statistics know that this is not so. 

An index of manhour output expresses a complex of influences of 
which worker efficiency is only one. Frequently it is one with negligible 
effect upon the final statistical result. The other influences include the 
quantity and quality of capital inputs, the rate of capacity of operation, 
and a host of management factors such as the regularity of materials 
flow and the efficiency of supervision. 

Consider, for example, the quantity and character of the capital 
inputs with which labor is combined in production. A given commodity 
produced in capital-intensive ways will be produced at a high rate of
output per unit of labor input; contrariwise, the same quantity pro- 
duced in labor-intensive ways will be produced at a low rate of output 
per unit of labor input. The way in which factors of production are 
proportionally combined will be determined by the relative prices of 
the factors, with entrepreneurs pursuing the lowest cost combination 
for a given output. Where labor is cheap, relative to capital, the ratio 
of marginal physical productivities will equal the ratios of factor prices 
and the least outlay combination will have been achieved, when much 


1 See section on “Regional and International Comparisons,” p. 35 ff. in U. S. Bureau of Labor
Statistics, Selected References on Productivity, Washington, D. C., October 1946, mimeographed, for
bibliography; L. Rostas, Comparative Productivity in British and American Industry, Cambridge,
England, 1948; C. E. V. Leser, “Output per head in different parts of the United Kingdom,” Journal
of the Royal Statistical Society, Series A 113 (1950), 207; “International Comparisons of Labor
Productivity,” Ch. VII of Methods of Labour Productivity Statistics, Geneva, International Labour Office,
1951.

2 Solomon Fabricant, “Of productivity statistics: An admonition,” Review of Economics and
Statistics, November, 1949, p. 309 ff.; Solomon Fabricant, review of Rostas, op. cit., Review of Economics
and Statistics, May, 1950, p. 188 ff.
labor is combined with little capital. Where labor is relatively expensive,
the proportions will run in the other direction.
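
The least-outlay condition described above can be written compactly. The symbols are our own illustrative notation, not the author's: $MP_L$ and $MP_K$ stand for the marginal physical products of labor and capital, and $w$ and $r$ for their prices.

$$\frac{MP_L}{MP_K} = \frac{w}{r}$$

If marginal products diminish as more of a factor is used, a low $w/r$ (cheap labor) is matched by combining more labor with each unit of capital until $MP_L/MP_K$ falls to that ratio. Output per unit of labor is then low for reasons that have nothing to do with worker skill or effort.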

We see, therefore, that if the capital-labor ratios are significantly 
different in the compared places, any inter-spatial comparison of labor 
productivity will tell us nothing about comparative worker efficiency. 
An example is provided by Puerto Rico in comparison with the United 
States. In Puerto Rico a dense population, with a high rate of natural 
increase in a dominantly-agricultural island, provides a relative abun- 
dance of labor which is available for wage-hire at very low prices and 
which will be self-employed at imputed prices which are even lower. 
Average earnings in manufacturing were forty-four cents per hour in 
August 1949, compared to $1.40 in the United States in the same 
month. (And the weight of evidence seems to be that the wage differ- 
ential between Puerto Rico and the United States has widened over 
the last fifteen years.) 

In these circumstances of cheap labor, labor-intensive techniques of 
production are everywhere practiced in Puerto Rico. As a result output 
per worker is low compared with the United States. The basic imple- 
ments of agriculture in Puerto Rico are the machete and hoe; the 
trade and service sector of the economy is characterized by ambulant 
vendors and very small shops which break packaged goods for small
unit sales; almost half the manufacturing employment in the island 
is composed of homeworkers in the needlework industry who work 
with hand needles, casually, and without supervision. 

Manufacturing establishments are small and are, for the most part, 
in industries in which capital per worker is relatively low even in 
other, more developed, places. In these establishments, when choices 
are available with respect to factor combination, they are made in 
favor of more labor and less capital. That is to say, even in the low 
capital coefficient industries characteristic of Puerto Rico’s manufac- 
turing, techniques which require larger capital inputs are not used. 
Sewing machines in the needlework industry, for example, are fre- 
quently obsolete and physically depreciated. Cigars are made by hand 
with the use of only a knife and a board on which to operate. The 
hooked rug industry uses a simple frame to hold the rug foundation 
in place and a hook or needle operated by hand; electrically operated 
hooks which have been developed are not used in Puerto Rico. The 
grapefruit canning industry peels, segmentizes, washes, cans and juices 
its grapefruit by hand, rather than mechanically, as is done on the 
mainland. Beads are strung on the island by hand, whereas mainland 
establishments have developed bead stringing machines. Even in 
those industries which simulate mainland practices with respect to the
basic processes, materials and product handling is a hand process. 
The use of automatic recording and control devices is rare. 

We see, then, that labor efficiency comparisons, if they are to be 
made fruitfully at all, must be made in ways in which the capital 
coefficient will be held constant. Can this be done? If so, will the labor 
productivity statistics then show similarities or differences in inter- 
spatial worker efficiencies? 

An attempt was made to find the answer in a Puerto Rico-United 
States comparison for selected industries in which basic production 
techniques seemed to be similar. The industries selected for study 
were sugar refining, fertilizer mixing, cement, and power generation. 
United States productivity statistics were secured by reference to the 
published reports of the Bureau of Labor Statistics. For Puerto Rico, 
statistics were developed from company records. 

The findings were that for all industries, except power generation, 
output per manhour was not much different in Puerto Rico and in 
the United States. Manhours per ton of refined sugar were 6.68 in the
United States in 1948 and were 5.09 in Puerto Rico in 1949; manhours 
per 100 barrels of cement were 48.4 in the United States in 1945-46 
and 47.2 in Puerto Rico in 1948-49; manhours per ton of mixed fer- 
tilizer were 2.46 in the United States in 1948 and 2.69 in Puerto Rico 
in 1948-49. However, kilowatt hours of energy produced per worker 
in hydroelectric plants was over 5 million in the United States in 1942 
in companies of the same size class as in Puerto Rico, and 1.3 million 
in Puerto Rico in 1948-49. 
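
The figures just quoted can be set side by side as simple ratios. The short Python sketch below does only that arithmetic. The dictionary and function names are our own, the years compared differ between the two places as the text notes, and "over 5 million" is taken as 5.0 million, so the power ratio is an upper bound.

```python
# Manhours per unit of output as quoted in the text: (United States, Puerto Rico).
MANHOURS_PER_UNIT = {
    "refined sugar (per ton)": (6.68, 5.09),
    "cement (per 100 barrels)": (48.4, 47.2),
    "mixed fertilizer (per ton)": (2.46, 2.69),
}


def relative_labor_input(us, pr):
    """Puerto Rican manhours per unit as a fraction of the US figure."""
    return pr / us


ratios = {
    name: relative_labor_input(us, pr)
    for name, (us, pr) in MANHOURS_PER_UNIT.items()
}

# Hydroelectric power is quoted as output per worker (million kWh), not
# manhours per unit, so the ratio here is Puerto Rican output over US output.
power_output_ratio = 1.3 / 5.0

for name, ratio in ratios.items():
    print(f"{name}: Puerto Rico uses {ratio:.2f} of US manhours")
print(f"hydroelectric kWh per worker, Puerto Rico/US: at most {power_output_ratio:.2f}")
```

On these figures the three manufacturing ratios lie between roughly 0.76 and 1.09, while output per worker in power generation is about a quarter of the United States level.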

Yet we cannot say that even these data reflect comparative worker 
efficiency in Puerto Rico and the United States. It is true that these 
are industries in which the products seem to be highly homogeneous, 
relative to the degree of product homogeneity in most industries; and 
they are industries in which the fundamental processes of production 
seem to be closely similar in the two places. But though we have held 
these constant, we have not put all the relevant influences upon the 
statistics, excepting only the efficiency of labor, in ceteris paribus. 

In the fertilizer mixing industry, we encountered product non- 
homogeneity because the number of ingredients mixed in Puerto Rico 
is fewer than in the United States where the use of minor elements to 
prevent soil deficiency diseases is much more common. There is a 
much larger number of distinct mixing formulae in the American 
industry. In addition, there were other variant practices in the two 
places that affected the quantitative relationship of labor input to 
product output. In the United States, the ingredients pass through a
second mixing after exposure to the air; in Puerto Rico, they are mixed 
only once. Automatic bag sealing and mechanical conveyance of both 
materials and finished products is much more widespread on the main- 
land than on the island. In Puerto Rico mixing is done to order rather 
than for stock as in the United States, so that there is a lesser number 
of discrete product moves on the island. 

Sugar refining facilities in Puerto Rico are all locationally integrated 
with raw sugar mills and raw sugar is transported to the refineries 
through mechanical conveying devices. Most of the refined sugar out- 
put on the mainland, however, involves the processing of bag-delivered 
raw sugar and a substantial quantity of labor inputs in bag-handling, 
emptying, and washing is dispensed with on the island. A much 
smaller proportion of the Puerto Rican output is packaged in con- 
sumer-size packages. On the other hand, the use of fork lifts and pallets 
in the conveyance of packaged sugar is much more common on the 
mainland, as are automatic devices for packaging and check-weighing. 

In cement production, the Puerto Rican industry saves manhour
inputs by using oil as fuel, rather than the coal used by many mainland 
plants which requires treatment before use in firing kilns. Power at 
the Puerto Rican plants is purchased from outside power-generating 
agencies and the labor inputs in the generation of power are not 
charged to the cement industry. In the United States, on the other 
hand, power is frequently generated by the cement plants themselves 
and labor inputs appear in the mainland averages but not in those of 
the island. Differential requirements for labor also occur between the 
two places because of variant practices with respect to the proportions 
of cement shipped in bulk and the proportions of bagged cement 
packaged in cloth, rather than paper bags. Differences also occurred 
in the rates of capacity of operation in Puerto Rico and the United 
States and these have great influence on the productivity ratios. 

In the power generating industry the input-output relationship is 
affected by differential topographies. It is impossible to dam up in 
Puerto Rico the enormous quantities of water which are collected in 
reservoirs in the United States; it is therefore much more important 
in Puerto Rico to take advantage of flash flood showers which pour 
over the dam spillways uncontrolled. Also there is much less extensive 
use of automatic controls in Puerto Rico. 

The study of the four relatively highly homogeneous product and
process industries indicates that practices of production and industry
structures and even product are in fact characterized by so much
diversity that it is well-nigh impossible to hold constant the influences 
other than worker skill and effort in a compared situation. And there 
are differences also in size of plant, rate of capacity of operation, the 
character of subsidiary equipment, and the level of skill of manage- 
ment. 

Is it possible to approach the problem of inter-spatial worker ef- 
ficiency comparison by selecting standard operations, important in the 
complex of all operations in a particular industry, and comparing the 
input-output ratio for those operations in the compared places? This 
approach would, of course, confront, from the start, the question of
selecting standard operations sufficiently representative of the com- 
plex to give meaningful results. Assuming that this hurdle can be 
successfully overcome, however, there is still the question of eliminat- 
ing from the input-output ratio, even at the level of the job operation, 
influences external to worker skill and effort. The influence of management,
for example, can never be completely without effect upon the results.
This is so, because the worker’s skill will be affected by the quality
of the training he received and the worker’s effort by the form of
wage payments made to him and the character of the relationship be- 
tween him and his supervisor. Similarly worker skill and effort will be 
affected by a multitude of such factors as the value system of the work 
group, degree of urbanization, and amount and quality of formal school- 
ing. 

We can, however, quite reasonably distinguish between those in- 
fluences which affect the productivity level through worker efficiency 
and those which affect it in addition to worker efficiency. In a compara- 
tive analysis of inter-spatial worker efficiency, it is only the latter which 
need exclusion. Can we exclude even these in establishing comparisons 
at the job operational level? 

Tentative attempts to use this methodology in comparing Puerto 
Rican-United States worker efficiency also ran into trouble. In the tex- 
tile industry, for example, weaving is a key operation. The output of 
the weaver is measured in number of picks. A pick is a lateral pass of 
the shuttle of filling yarn across the warp yarn. Yardage of cloth is directly
proportional to number of picks. On the surface, therefore, it was
necessary only to compare the pick rates of weavers in the United
States and Puerto Rico, both working on the same type of cloth, to 
derive comparative efficiency. The number of picks produced per hour, 
however, is not only a function of the skill of the weaver in repairing 
yarn breaks, but also of the speed of the loom, the brittleness of the
yarn, and atmospheric conditions in the plant, all beyond worker
control.

Nor can an efficiency comparison be established for weavers by a 
count of looms tended, that is, by a comparison of work loads. Fewer 
looms tended in Puerto Rico was found to reflect, among other things,
an unwillingness of management to risk lower machine efficiency at the 
price of higher labor efficiency, since the two are in inverse relationship 
to one another and, in the circumstances of current relative prices of 
machines and labor, it seemed wise to pursue a machine-prudent and 
labor-profligate policy. 

It is not suggested that the case of the weavers is representative of 
all occupations with respect to the possible success of establishing com- 
parative efficiency through input-output ratio comparisons at this job 
operational level. What is being entered here is a caveat against the use 
of the technique without care and the drawing of unwarranted assump- 
tions from bare data, unanalyzed. 

No really successful comparisons of worker efficiency, in cardinal 
terms, seem yet to have been made. Given the enormous recent interest 
in backward region development, it is an area in which much research 
can be done. It is necessary to warn, however, that the statistics must 
not be taken at face value, and that the results will be surrounded by 
qualitative limitations. 
EVOLVING MECHANISMS FOR THE PRODUCTION OF
INTERNATIONAL HEALTH STATISTICS* 


HALBERT L. DUNN
National Office of Vital Statistics 


FIVE and a half years ago, while the World Health Organization
(WHO) was in the process of being created, I was asked to present
a paper before the Canadian Public Health Association on the needs 
for international health statistics. In retrospect, I am sure that the 
course charted in that paper was more fanciful than factual. The pat- 
terns and working methods of WHO had not then emerged. It was not 
feasible to consider needs as specific tasks for an adequately staffed 
global health agency and working mechanisms to deal with the specific 
tasks were still in the future. Nevertheless, several broad tasks were 
apparent. 

First, it was self-evident that the reporting mechanism for com- 
municable diseases needed improvements both in completeness and 
timeliness. Five and a half years later, this need is still with us, since 
little or no progress has been made in this area. 

Second, there was a need to establish a standard international sta- 
tistical classification applicable both to morbidity and mortality, and 
to establish the working mechanisms needed to put such a classification 
into world-wide effect. In this area, we have seen very substantial 
progress. A single classification of diseases and causes of death has been 
adopted, and mechanisms are being developed to keep it up to date
and to help the various countries make good use of it. 

A third task was to strengthen the various mechanisms that produce
vital statistics and health statistics at national levels. In this area, we
can point to real progress: the vigorous activities of international
statistical secretariats to provide technical and advisory aid; consummation
of the 1950 Census program throughout many nations of the
World; the emergence of “Point IV” training and exchange activities; 
and the establishment of numerous training centers on the subjects 
of census, vital statistics, and health statistics. 

In addition to these activities, a new and quite unexpected mechan- 
ism—the concept of National Committees on Vital and Health Sta- 
tistics—emerged as a by-product of an International Conference in 1948. 
I shall speak later of the impact and deeper significance of these com- 
mittees, which are proving themselves a most useful mechanism for the 


* Presented at the Annual Meeting of the American Statistical Association, Boston, December 
27-29, 1951. 
two-way flow of ideas and creative thinking between the national and
the international levels. 

It is my purpose in this paper to dwell on the progress that has been 
made to date and on an estimation of the tasks and evolving mechan- 
isms in each of these four major spheres of activity; also to discuss 
what might be done to promote a more adequate and fuller use of 
sampling and survey techniques which as yet have not been used to 
any appreciable degree in the field of international health statistics 
although holding forth great promise if they can be adequately de- 
veloped. 


1. NOTIFIABLE COMMUNICABLE DISEASES


First of all, let us consider the outlook for notifiable communicable 
disease statistics. It seems obvious that international control over the 
spread of epidemic disease has to be one of the principal functions of 
the WHO. Quarantine regulations are no longer effective of themselves. 
In former days, when slow-moving boats took weeks or months to go
from country to country, quarantine regulations did not seem too 
great a hardship relative to the total time spent in the voyage. With 
air travel, it is quite another matter. Travelers are impatient with 
even the few safeguards that now exist and which are quickly gotten 
over. The best of preventive measures will be superficial under such 
conditions, and it therefore becomes extremely important that knowl- 
edge of epidemics and contagious diseases be available with greater 
promptness, completeness, and accuracy. 

Although nothing on a world-wide scale has been accomplished, re- 
gional examples of progress may be noted. In Hawaii, which has be- 
come more or less a crossroads of air traffic for the entire Pacific area, 
the Health Officer is deeply interested in the possibility of establishing 
a Pacific Area Communicable Disease Information Center. Plans are 
under way for a mechanism sufficiently dependable and up to date in 
its epidemic intelligence to permit the installation of safeguards for 
the population of Hawaii against epidemics while they are still thou- 
sands of miles away. Knowledge concerning cholera, plague, smallpox, 
epidemic typhus, and yellow fever—the diseases covered by Inter- 
national Sanitary Convention—is no longer sufficient. Epidemic in- 
formation is also needed to anticipate the introduction of dengue, 
encephalitis, influenza, and other diseases. 

Epidemics and contagious diseases will never be effectively con- 
trolled simply by cutting the lines of communication of disease be- 
tween countries. The sources of some diseases must be sought out and 
eradicated. Consequently, a serious effort must be made to improve 
the mechanism of communicable disease reporting in all the nations
of the world. In nations where such a reporting mechanism does not 
exist, it must be established. Promotion of interest in this task and the 
giving of assistance in carrying it out is a task for WHO. 

The Conference on Morbidity Statistics, recently assembled by 
WHO in Geneva, summarized the principal weaknesses of the present 
mechanisms as follows: Incompleteness of notification which affects,
frequently to a considerable extent, the value of the statistics of
communicable disease in many countries; the variety of criteria by 
which the same disease is defined in different countries for the purpose 
of notification; the fact that the lists of communicable diseases re- 
ported vary so considerably; and, as a corollary, the wide differences 
in the data on communicable diseases published in the different coun- 
tries. 

While WHO has done little in recent years to improve the mecha- 
nism of communicable disease reporting, it fully realizes the need to 
improve the mechanism. The Fourth World Health Assembly requested 
the Executive Board: (1) To examine and report on the present ar- 
rangements, and their possible improvement, for the collection and 
administration of epidemiological information in respect to epidemic 
diseases other than the six quarantinable diseases mentioned in the 
Regulations; and (2) to study ways and means for coordinating WHO 
activities on such epidemic diseases, and, for this purpose, the modifi- 
cation of the terms of reference for the present Expert Committee on 
International Epidemiology and Quarantine. 

The recent Conference on Morbidity Statistics recommended that 
this proposed review be extended to cover a critical appraisal of the 
uses of such statistics and of their value not only to the epidemiologist 
and the quarantine official but also to the health statistician. 

Among the unanswered questions are these: To whom are the sta- 
tistics of value? For what reasons? How might they be changed to im- 
prove their usefulness? What are the national obligations involved in 
the production of such data to make them more useful? What are the 
international obligations in terms of analysis, timing, reproduction 
and distribution of the results? To what degree might “epidemic re- 
porting” be developed either as a substitute for or a supplement to the 
epidemiological reports? 

In all probability, this subject will become a major part of the pro- 
gram considerations of WHO during 1952 and 1953. 

Keen interest has been expressed by staff members of WHO and by 
the Morbidity Statistics Conference in the system of epidemic reporting
that has been experimentally developed in the United States. Consideration
of how this idea might be developed on a world-wide basis
will doubtless become a matter for inclusion in any future work done
on the modification of the communicable disease reporting mechanism. 


2. THE INTERNATIONAL STATISTICAL CLASSIFICATION OF DISEASES, 
INJURIES, AND CAUSES OF DEATH 


Five years ago, the first step had already been taken to remake the 
International List of Causes of Death into an International Statistical 
Classification of Diseases, Injuries and Causes of Death. This audience 
is, in general, familiar with what has happened in connection with the 
Sixth Decennial Revision of this Statistical Classification. 

Even at the Fifth Decennial Revision Conference, held in 1938, it
was recognized that there was great need for an International Classification
of morbidity terminology of much the same nature as the International
List of Causes of Death. Consequently, all nations of the
world were urged to develop such classifications of sickness, and to experiment
with them with the aim of ultimately internationalizing the
code. Since the U. S. Government had been asked by the 1938 Conference
to create a Committee on Joint Causes of Death, it was able to
come to grips with the study of morbidity classification as part of its
assignment. Five years ago, this U. S. Committee, under the leadership
of Dr. Lowell J. Reed and with invited members from Canada,
England, and the League of Nations, had already developed the first 
draft of a combined classification of sickness and death, and was in the 
process of trying it out in hospitals in Canada, England, and the 
United States. After the first draft had been tested in this manner, it 
was revised and passed over with “no strings attached” to an Expert 
Committee for the Preparation of the Sixth Decennial Revision of the 
International List of Causes of Death which had been appointed by 
the Interim Commission of WHO. The document, revised according to 
criticism obtained from the various countries, became the basis for the 
deliberations of the International Conference held for the Sixth Decennial
Revision of the Classification. Later the work of this Conference
was adopted by the First World Health Assembly and, identified as 
Regulations No. 1, was established as a guide to the health agencies of 
Member Nations. 

The emergence of this combined sickness and death classification 
came at a particularly appropriate time, since in many nations there 
has been great interest and activity in the development of medical 
care and insurance programs for health purposes. Without doubt, the 
existence of an internationally acceptable sickness and mortality statistical
classification constitutes an outstanding achievement in the
field of health and has been so recognized throughout the world. 
Issuance of Manuals of this International Statistical Classification by
WHO has by no means ended the problems in connection with its use. 
Immediately, as countries began to code their records in accordance 
with it, many difficulties arose and many interpretations were found 
to be necessary. Consequently, the Expert Committee on Health Sta- 
tistics recommended that WHO create a Center for Problems Arising 
in the Application of the International Statistical Classification. This 
was considered by the World Health Assembly and was actually 
established on January 1, 1951, in cooperation with the General Reg- 
ister Office of England and Wales. The activities of the Center were 
many and encompassed correspondence with and visits to countries 
using the International Statistical Classification. The Center also 
carried on active research on national records with respect to certification
of cause of death and morbidity reporting in sickness and hospital
surveys. In addition, it rendered active assistance to various training
courses for coders and at the same time prepared instructions to
physicians on medical certification of causes of death. These instruc- 
tions are being circulated throughout the world. As a result of these 
discussions and deliberations, a pamphlet has been prepared that will 
give supplemental interpretations of the Manual and instructions for 
coding the causes of death. The achievements of the Center in its first 
year of operation have been so outstanding that both the Conference 
on Morbidity Statistics and the Expert Committee on Health Sta- 
tistics have urged that it be continued as a regular activity of the 
WHO Secretariat. 

While the great contribution of the WHO Center is widely appreci- 
ated, experts of many countries have been fully aware that activities 
of this nature are by no means a substitute for a full-scale decennial re- 
vision of the International Statistical Classification since such a periodic 
revision is needed to keep it abreast of medical progress and public 
health needs. After deliberating on these matters, the Expert Com- 
mittee on Health Statistics decided to propose to the World Health 
Assembly that future revisions of the International Statistical Classi- 
fication should be held decennially and that no changes should be made 
in the 3-digit category of the detailed list of the Classification in interim 
periods; moreover that, in the preparatory work of the revision, full 
opportunity should be given for consultation with national administra- 
tions, culminating in an international revision conference to be held 
under the auspices of the WHO. It has been further pointed out by a 
number of nations that the best time to undertake a revision of the 
List would be in the middle of the decennial period, so that the manual 
itself would be available for use around the decennial census period
when many special studies, of both sickness and death, can be made. 
For population base data, which are needed for the calculation of 
rates, become available only at that time. 


3. STRENGTHENING OF NATIONAL VITAL AND HEALTH STATISTICS 
ORGANIZATIONS


It is difficult to realize, in a country like the United States, how 
scattered and rudimentary are many of the sources for morbidity and 
mortality statistics in some of the countries of the world. In this 
country, we obtain our morbidity statistics from a great variety of 
sources. Vital statistics, in particular those coming from the death 
certificates, are of great use in planning for public health programs. 
Likewise, sickness surveys of one type or another have proved their 
value for planning and evaluating programs. The communicable disease 
mechanism is nationwide and, although it has many defects, gives sub- 
stantial information for the control of epidemic disease. In certain dis- 
eases, a registration procedure provides a control and follow-up mech- 
anism with statistical by-products. Specialized types of morbidity data 
come from the records of traffic accidents, of industrial and occupational
accidents and diseases, of general hospital and clinic patients
and out-patients, of a great number and variety of voluntary health 
insurance and pension plans, and of social security, not to mention 
the records of physicians themselves, particularly those engaged in 
operating interrelated clinics. The variety and abundance of medical 
data that exist in this country must be considered an exception rather 
than a universal rule. A great majority of the countries of the world 
are fortunate indeed if they have a recent population census base and 
a vital statistics system of some sort for the country as a whole. While 
in most countries certain cities have hospital and health data available, 
few have complete coverage and must be classified as countries with 
relatively underdeveloped health statistics resources. 

In the spring of 1951, at the request of the Statistical Commission,
the Statistical Office of the United Nations undertook a world-wide 
survey of vital registration and vital statistics practices, to aid in the 
formulation of a set of standards for vital registration and statistics
throughout the world. This survey revealed wide discrepancies and 
major gaps in information of basic importance to the field of public 
health. Both the Statistical and Population Commissions have gone 
on record that steps should be taken to elaborate this report and to 
develop guiding principles for the establishment of an international 
vital statistics system. These actions were endorsed at the recent meeting
of the Expert Committee on Health Statistics, and it is safe to say
that solid work in this respect will proceed in the years immediately 
ahead. 

Perhaps the greatest single advance in the past five years in the 
strengthening of national morbidity statistics has been the extensive 
effort to produce satisfactory population census information through- 
out the world. 

Census data are of great importance in the use of sickness and death 
data, since they afford the base upon which rates can be computed. 

The most concerted effort in connection with the census was brought 
about in the American nations through a five-year continuing program 
termed the “1950 Census of the Americas.” This program, stimulated 
primarily through the activities of the Inter American Statistical In- 
stitute, was carried forward under the auspices of a 22-member com- 
mittee (one from each of the American nations), with the cooperation 
of various international organizations. Under it, each country, in order 
to attain inter-American comparability of results, agreed to use certain 
basic minimum standards as to census questions, definitions, and pub- 
lished tabulations. 

The other major development in the field of helping nations and or- 
ganizations that produce vital and health statistics has been a series 
of training centers sponsored primarily by the United Nations, with the
collaboration of various other agencies, both international and national.
The recent Biostatistics Seminars held in Santiago and Cairo are ex- 
amples. The effort to aid national agencies in the improvement of their 
vital statistics and health statistics procedures will undoubtedly re- 
ceive great stimulus from the “Point IV” Program as it develops. The 
general policy laid down in the enunciation of this program was that it 
should make available to all the countries of the world, and particularly 
to underdeveloped areas, the scientific knowledge necessary to bring 
about improvement and growth that would help the people of such 
areas realize their aspirations for a better life. It has not been easy to 
develop this general philosophy into a concrete program for action. 
However, the program is now being put into effect and will involve 
elements of effort and action for the secretariats of international agen- 
cies as well as for various organizations functioning at the national 
level. In general, I believe it can be concluded that the need to 
strengthen national facilities so that they are capable of producing the 
vital data and the health statistics necessary for international health 
programs is a basic concept. Without strong national facilities, cer- 
tainly an adequate international mechanism for the production of 
health statistics cannot be obtained.

4. NATIONAL COMMITTEES ON VITAL AND HEALTH STATISTICS


The concept of National Committees on Vital and Health Statistics 
was a direct outgrowth of the Sixth Decennial International Conference 
for the Revision of the International Statistical Classification of Dis- 
eases, Injuries and Causes of Death. This Conference was held in Paris 
during the spring of 1948. The Conference, whose agenda had been 
planned to extend over a 5-day period, enjoyed the unusual experience
of finishing all its work in the first two days. Participants realized that 
this unusual situation stemmed directly from the fact that a major 
contribution had been made by the United States Committee on Joint 
Causes of Death which had laid the basic groundwork for approval of 
the Classification. The participants realized that the funds and tech- 
nical skills contributed by national programs to achieve this purpose 
could not possibly have come from the resources available at that time 
to international agencies. They concluded, therefore, that this method 
of working could probably be used to great advantage in the future on 
the many problems facing public health statistics. After considering
the program, the delegates at Paris decided that all countries should 
establish National Committees on Vital and Health Statistics, as a 
means of working toward solution of some of the national and inter- 
national statistical problems, and that the results of their efforts should 
receive consideration by the Expert Committee on Health Statistics of 
WHO. 

This philosophy was endorsed by the World Health Assembly, and 
the WHO invited all nations to create National Committees so that 
they might initiate, as well as participate in, studies in the establish- 
ment of international statistical standards. Once the idea was launched, 
it spread rapidly, and now many of the nations of the world have established
National Committees which are actually working on vital statistics
and health statistics problems.

The recent Conference on Morbidity Statistics gave considerable 
impetus to the use of this mechanism by assigning to National Com- 
mittees important study jobs. Many of these tasks involved basic 
work in connection with defining terminology to be used in the field of 
morbidity, as, for example, the definition of terms of a general nature, 
those for special use, and others dealing with methods of classification 
according to type, severity, duration, kind of medical attention, etc. 
A different set of definitions will be studied in connection with vari- 
ous kinds of morbidity statistical rates. Other tasks assigned to Na- 
tional Committees for study concerned morbidity subjects such as the 
criteria of patterns and designs for surveys of sickness, for the production
of hospital statistics, the improvement of mechanisms for the reporting
of notifiable diseases, and the adaptation of the International
Statistical Classification of Diseases, Injuries and Causes of Death to
specified problems.

Both the Conference on Morbidity Statistics and the Expert Com- 
mittee on Health Statistics expressed the view that it was quite im- 
portant to hold an International Conference of Representatives of Na- 
tional Committees at an early date. The plan would be that govern- 
ments should send delegates to such a conference at their own expense, 
but that the staff work needed for its preparation would be carried out 
by the WHO. Many of the problems assigned to National Committees 
would automatically be scheduled as subjects for discussion and recom- 
mendation at such a Conference. 

It is the feeling of those who are close to the problems involved 
that the mechanism of national committees is an exciting new idea 
which will make it possible for health statistics problems at the local
level to be considered and dealt with on an international scale. It 
reverses the usual process of setting standards through expert com- 
mittees and then sending them to the nations, and substitutes a two- 
way flow of ideas between the “grass roots” and the international level. 
It holds the opportunity for a truly cooperative enterprise by na- 
tional and international technicians in the field of vital statistics and 
public health statistics. 

The Expert Committee, after reviewing the very substantial progress 
made in the organization and conduct of National Committees, expressed
its strong conviction that the proposed International Conference
should be held early in 1953, if possible, and that a major element
on the agenda should be a careful review of the objectives, organiza- 
tional pattern, program, and working relationships of the National Com- 
mittees with one another and with the various international agencies. 
Such a review was considered desirable in order that guiding principles 
for these bodies in their mutual interrelationships could be laid down 
in a way to bring about the maximum over-all effectiveness in resolving 
international statistical problems. 


5. THE IMPORTANCE OF THE SURVEY TECHNIQUE 


Not until the current year has a realization emerged in the thinking 
of public health statisticians that the future development of satisfac- 
tory international health statistics depends in large part upon the use 
of health survey and sampling mechanisms. In discussing many of the 
problems concerned with producing satisfactory health statistics, the 
Conference on Morbidity Statistics continually came back to the idea 
that substantial progress would be difficult unless survey and sampling
techniques were developed. These methods are fairly new, even for
countries having highly developed health statistics; and yet they would 
be adaptable not only to such nations but also to countries in which 
health data and vital statistics are primitive or nonexistent. 

Mortality statistics coming from registered documents and the data 
obtained by the reporting of notifiable communicable diseases exist to 
some extent in most countries. In countries without these methods it 
is not an easy matter to install them. The building of a satisfactory 
vital statistics system takes years of effort, together with substantial 
investments in terms of money and training. In nations with little or 
nothing to build upon in terms of established mechanisms for the 
production of health statistics, it would seem that the survey procedure
holds out the best hope for the development of such data with reason- 
able promptness. In countries that have many sources of morbidity 
data, the types of information obtained through surveys would pro- 
vide a means of materially broadening the interpretation of specialized 
types of statistical information. 

The Conference on Morbidity Statistics considered these facts, in 
particular that the survey method covering either a whole population or 
a representative sample would appear to be a promising method for 
obtaining various types of health statistics not available by any other 
means. To move in this direction, however, it was realized that it 
would be necessary to draw from “pools” of technicians properly 
trained in sampling theory and familiar with the application of this type 
of technique to practical health problems. As a consequence, a prin- 
cipal action of the Conference was to recommend experimentation 
with the survey method in various countries and the establishment by 
WHO of a suitable health demonstration project in an underdeveloped 
area so as to obtain experience in the conduct of sickness surveys 
under such conditions. It further recommended to national health 
agencies that a group of experts in sampling theory be established 
within their organizations. The great advantage of this proposal is 
that it would result in a body of expert skills and technical experience 
that could be drawn upon in planning surveys within the countries 
and also be loaned internationally under appropriate conditions. 

In conclusion, it seems to me that we can take considerable satisfac- 
tion in the progress that has been made in international health sta- 
tistics over the last five years, and in the fact that a substantial area of 
study has been defined for the next 10-year period that should result 
in further progress. The most difficult single element in achieving this 
end will be the establishment within our national health agencies of the 
sampling and survey skills needed for the solution of both national and 
international health statistical problems.


A GENERALIZATION OF SAMPLING WITHOUT
REPLACEMENT FROM A FINITE UNIVERSE* 


D. G. Horvitz† and D. J. Thompson
Iowa State College 


This paper presents a general technique for the treatment 
of samples drawn without replacement from finite universes 
when unequal selection probabilities are used. Two sampling 
schemes are discussed in connection with the problem of de- 
termining optimum selection probabilities according to the 
information available in a supplementary variable. Ad- 
mittedly, these two schemes have limited application. They 
should prove useful, however, for the first stage of sampling 
with multi-stage designs, since both permit unbiased estima- 
tion of the sampling variance without resorting to additional 
assumptions. 


INTRODUCTION 


WHEN sampling a finite universe in which we can identify the individual
elements, we are free to assign in a completely arbitrary
manner the probability of selecting an element on any particular draw. 
By appropriate assignment of the selection probabilities it is possible 
to reduce considerably the sampling variances of unbiased sample 
estimates over those obtained when sampling with equal probabilities 
throughout. 

The possibility of using unequal probabilities for selecting the sample 
elements from the universe as a means of increasing precision perhaps 
received its first impetus for applied sampling from Hansen and 
Hurwitz [2] in 1943. They introduced the selection of primary units 
(in a subsampling scheme) with probabilities proportionate to some 
measure of their size and presented the appropriate theory. Their 
sampling scheme was confined (when sampling without replacement) 
to samples of one primary unit per stratum, however, the theory not 
having been extended beyond this point. More recently, Midzuno [6] 
has generalized the Hansen and Hurwitz approach to sampling a com- 
bination of n elements of the universe with probability proportionate 
to some measure of size of the combination. Madow [5] has made 
some contributions to the theory of the systematic selection of several 
clusters with probability proportionate to a measure of size. 


* Journal Paper No. J 2139 of the Iowa Agricultural Experiment Station, Ames, Iowa, Project 1005. 
Presented to the Institute of Mathematical Statistics, March 17, 1951. 
† Now at the University of Pittsburgh.

Research in the theory of sampling for surveys has been concerned
with the development of more efficient sampling systems, the system 
including both the sample design and the method of estimation. One 
sampling system is said to be more efficient than another if the vari- 
ance or mean square error of the estimate with the first system is less 
than that of the second, provided the cost of obtaining the data and 
results is the same for both. The development of stratified, multi-stage, 
multiphase, cluster, systematic, and other sample designs beyond 
simple or unrestricted random sampling, as well as alternative methods
of estimation, have all resulted in increased efficiency in specific
circumstances. As indicated above, the appropriate use of variable 
probabilities for the selection of the sample elements can lead to gains 
in efficiency over systems using equal probabilities of selection. 

It is well known that if samples of size one are drawn with prob- 
abilities proportionate to the exact measure of the characteristic 
under observation, unbiased estimates of means or totals for the population
exist which have zero sampling error. Similarly, Midzuno provides
an unbiased estimator for his design which has zero sampling
error when the samples are drawn with probabilities proportionate to 
the total measure of the elements in each for the characteristic observed. 
Since in practical situations the values of the characteristic under study 
are not known in advance, the problem arises of determining the selec- 
tion probabilities (from any additional information available) which 
have optimum properties, i.e. maximize the efficiency. Midzuno [6] 
and Hansen and Hurwitz [3] have both considered this problem with 
some success. 

A limitation of the Hansen and Hurwitz scheme is that an unbiased 
estimate of the sampling variance of their estimator cannot be obtained 
from the sample elements. This difficulty appears to exist in Midzuno’s 
system as well, except in the trivial case of equal probability for each 
sample combination. 

The purpose of this paper is twofold. First, it provides a general 
method for dealing with sampling without replacement from a finite 
universe when variable probabilities of selection are used for the ele- 
ments remaining prior to each draw. An unbiased linear estimator for 
the population total of the characteristic measured is given, as well as 
the sampling variance of this estimator. An unbiased estimator for the 
sampling variance is also given. This is for a one-stage design. An ex- 
tension of the use of this method for two-stage sampling is presented. 
Second, it examines and discusses some of the problems arising in 
the practical application of sampling with variable selection probabilities.
In this connection, two sampling schemes, for samples of size two,
using unequal selection probabilities are presented. Although general 
use of these schemes is limited because of the small sample size, they 
should have wide application for the first stage of sampling with 
stratified two-stage designs. Either of the two schemes permits an un- 
biased estimate of the sampling variance to be made from the sample 
data without resorting to additional assumptions. 


SAMPLING WITH ARBITRARY PROBABILITIES OF SELECTION 


Let the universe, $U$, consist of $N$ elements $u_1, u_2, \ldots, u_N$. A sample
of size $n$ is to be drawn without replacement using arbitrary probabilities
of selection for each draw. We denote the probability of selection
associated with the $i$th element of the universe prior to the first draw
by $p_{i1}$ $(i = 1, 2, \ldots, N)$, where

$$p_{i1} \ge 0, \qquad \sum_{i=1}^{N} p_{i1} = 1.$$

This, in a sense, defines a probability distribution (of selection) for 
the elements of the universe for samples of size one. We are sampling 
without replacement so that prior to each succeeding draw we must 
define a new probability distribution for the remaining elements. These 
may be based on the initial probabilities or, in fact, can be a completely
unrelated set. For the $m$th draw we shall designate the probabilities
of selection by $p_{im}$, where, as above,

$$p_{im} \ge 0, \qquad \sum_{i} p_{im} = 1,$$

but the summation now extends only over the $N - m + 1$ remaining
elements.¹ We will denote the $n$ sets of selection probabilities by

$$\{p_{im}\}; \qquad m = 1, 2, \ldots, n. \qquad (1)$$


Knowing the probability distributions used at each draw, it is possible
to compute the a priori probability that the $i$th element (i.e. $u_i$)
will be included in a sample of size $n$. This probability will be
designated notationally by $P(u_i)$. It is well known that

$$\sum_{i=1}^{N} P(u_i) = n \qquad (2)$$

rather than one since we are not summing probabilities of mutually
exclusive events, except for samples of size one.

¹ Actually, sampling without replacement as considered here is the special case of sampling with
replacement which arises when the elements once selected have probability zero of being chosen on any
succeeding draw.


There are $\binom{N}{n}$ different samples when $n$ elements are drawn without
replacement from a finite universe of $N$ elements (assuming that
at each stage of the draw all remaining undrawn elements have a
probability greater than zero of being selected). Consider now the
number of possible samples when the order of draw is taken into account.
Since each different sample could occur in $n!$ different orders
there are $n!\binom{N}{n} = S$ possible samples, considering order. Denote by
$s_n$ $(s = 1, \ldots, S)$ the $s$th such sample of size $n$. The probability that
$s_n$ will be drawn is given by the product of the probabilities of selection
of the elements in the sample considering the order of the draw. Thus,
if $s_n$ contains the elements $u_i, u_j, \ldots, u_t$ drawn in that order, then

$$\Pr(s_n) = p_{i1}\, p_{j2} \cdots p_{tn}. \qquad (3)$$

The probability, $P(u_i)$, of including element $u_i$ in the sample plays
the fundamental role in the theory developed in the following sections.
For a sample of size $n$, $P(u_i)$ reduces to a summation of the probabilities
associated with the $n!\binom{N-1}{n-1} = S^{(i)}$ samples that contain $u_i$.
Notationally, we have

$$P(u_i) = \sum_{s=1}^{S^{(i)}} \Pr\left[s_n^{(i)}\right] \qquad (4)$$

where we are designating a specific sample of size $n$ which includes $u_i$
by $s_n^{(i)}$.

The extension to the a priori probabilities of including both the elements
$u_i$ and $u_j$ in a sample of size $n$ follows readily. Thus

$$P(u_i u_j) = \sum_{s=1}^{S^{(ij)}} \Pr\left[s_n^{(ij)}\right] \qquad (5)$$

since there will be $n!\binom{N-2}{n-2} = S^{(ij)}$ such samples, $s_n^{(ij)}$ designating a
specific one.
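
As an illustration of (2) through (5), the inclusion probabilities of a small universe can be obtained by brute-force enumeration of the ordered samples. The sketch below is not part of the paper. It is written in Python, and the draw rule (the initial probabilities renormalized over the undrawn elements), the function names, and the four initial probabilities are all assumptions made for the example; the paper allows any set of probabilities at each draw.

```python
from fractions import Fraction
from itertools import permutations

def ordered_sample_probs(p, n):
    """Return {ordered sample (tuple of indices): probability}, as in (3).

    Assumed draw rule: at each draw the initial probabilities of the
    elements not yet drawn are renormalized to sum to one.
    """
    probs = {}
    for s in permutations(range(len(p)), n):
        pr, remaining = Fraction(1), Fraction(1)
        for i in s:
            pr *= p[i] / remaining   # selection probability at this draw
            remaining -= p[i]        # probability mass left among undrawn elements
        probs[s] = pr
    return probs

def inclusion_probs(p, n):
    """Sum sample probabilities to get P(u_i) and P(u_i u_j), as in (4) and (5)."""
    P1 = [Fraction(0)] * len(p)
    P2 = {}
    for s, pr in ordered_sample_probs(p, n).items():
        for i in s:
            P1[i] += pr
            for j in s:
                if i != j:
                    P2[i, j] = P2.get((i, j), Fraction(0)) + pr
    return P1, P2

# Hypothetical initial probabilities for a universe of N = 4 elements.
p = [Fraction(k, 10) for k in (1, 2, 3, 4)]
P1, P2 = inclusion_probs(p, 2)
print(sum(P1))   # the inclusion probabilities add to n, not to one
```

Exact fractions are used so that the identity in (2) can be checked without rounding error. Enumeration grows as $n!\binom{N}{n}$, so this approach is only practical for very small universes.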


EXPECTED VALUES OF SUMS AND PRODUCT-SUMS 


Suppose now that we are to measure a characteristic $X$ for the $n$ elements
in the sample. Denote by $X_i$ the value of $X$ assumed by element
$u_i$. The $X_i$'s are not necessarily all different, of course. The expected
value of the sum of the observed values of $X$ in the sample is then

$$E\left(\sum_{i=1}^{n} x_i\right) = \sum_{s=1}^{S} \left(\sum_{i=1}^{n} x_i\right) \Pr(s_n).$$

Factoring the $X_i$ common to the $i$th element $u_i$ and summing over the
population, we have

$$E\left(\sum_{i=1}^{n} x_i\right) = \sum_{i=1}^{N} X_i \sum_{s=1}^{S^{(i)}} \Pr\left[s_n^{(i)}\right] = \sum_{i=1}^{N} P(u_i) X_i.$$

Note that for sample sums, $x_i$ refers to the value of $X$ for the element
selected on the $i$th draw. It follows readily that

$$E\left(\sum_{i=1}^{n} x_i^2\right) = \sum_{i=1}^{N} P(u_i) X_i^2.$$

The expected value of the sum of cross products $x_i x_j$, $i \ne j$, is given by

$$E\left(\sum_{i \ne j}^{n} x_i x_j\right) = \sum_{i \ne j}^{N} X_i X_j \sum_{s=1}^{S^{(ij)}} \Pr\left[s_n^{(ij)}\right] = \sum_{i \ne j}^{N} P(u_i u_j) X_i X_j.$$
Also, of course, 


n N 
ix] 
It is to be noted that the process of taking expected values of sums and 
product-sums reduces to summing the product of the particular func- 
tion of the observed values by the appropriate a priori probability 
over the elements of the universe. The extension to triple product- 
sums and higher should now be clear. 
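
The rule stated above, that an expected sample sum equals the universe sum weighted by inclusion probabilities, can be checked numerically. The following Python sketch is an illustration only: the values of $X$, the design (a hypothetical probability attached to each unordered sample of size two), and all variable names are assumptions, chosen so that the design is visibly not one of equal probabilities.

```python
from fractions import Fraction
from itertools import combinations

N, n = 4, 2
X = [Fraction(v) for v in (3, 5, 8, 13)]        # illustrative values X_i
samples = list(combinations(range(N), n))       # the six samples of size 2
weights = (1, 2, 3, 4, 5, 6)                    # arbitrary, assumed design
design = {s: Fraction(w, sum(weights)) for s, w in zip(samples, weights)}

# Inclusion probabilities P(u_i) and P(u_i u_j) implied by the design.
P1 = [sum(pr for s, pr in design.items() if i in s) for i in range(N)]
P2 = {(i, j): sum(pr for s, pr in design.items() if i in s and j in s)
      for i in range(N) for j in range(N) if i != j}

# Left-hand sides: expectations taken over the possible samples.
E_sum = sum(pr * sum(X[i] for i in s) for s, pr in design.items())
E_sq = sum(pr * sum(X[i] ** 2 for i in s) for s, pr in design.items())
E_cross = sum(pr * sum(X[i] * X[j] for i in s for j in s if i != j)
              for s, pr in design.items())

# Right-hand sides: sums over the universe weighted by inclusion probabilities.
rhs_sum = sum(P1[i] * X[i] for i in range(N))
rhs_sq = sum(P1[i] * X[i] ** 2 for i in range(N))
rhs_cross = sum(P2[i, j] * X[i] * X[j] for (i, j) in P2)

print(E_sum == rhs_sum, E_sq == rhs_sq, E_cross == rhs_cross)
```

Only the inclusion probabilities enter the right-hand sides, so two designs with the same $P(u_i)$ and $P(u_i u_j)$ give the same expectations for these sums.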


ESTIMATION OF THE POPULATION TOTAL 


The question of what to use for the estimation of population characteristics
when sampling with arbitrary probabilities of selection at
each draw naturally arises. We will restrict ourselves here to using an
unbiased linear estimator of a certain class for the population total of 
the characteristic X. 

Actually a number of subclasses of linear estimators exist when 
sampling a finite universe without replacement. For example, for 
estimating (from a sample of size $n$) the population total of $X$, i.e.

$$T = \sum_{i=1}^{N} X_i,$$

we could consider using either

$$\hat{T}_1 = \sum_{i=1}^{n} \alpha_i x_i,$$

where $\alpha_i$ $(i = 1, \ldots, n)$ is a constant to be used as a weight for the
element selected on the $i$th draw; or

$$\hat{T}_2 = \sum_{i=1}^{n} \beta_i x_i,$$

where $\beta_i$ $(i = 1, \ldots, N)$ is a constant to be used as a weight for the $i$th
element whenever it is selected for the sample; or

$$\hat{T}_3 = \gamma_{s_n} \sum_{i=1}^{n} x_i,$$

where $\gamma_{s_n}$ is a constant to be used as a weight whenever the $s_n$th sample
is selected. It should be noted that the $\alpha$ coefficients are independent
of the particular sample that is selected. However, the $\beta$ and $\gamma$ coefficients,
although known constants for a specified sampling procedure,
depend on the particular sample selected.

It is the usual procedure, whenever a linear function of n inde- 
pendent random variables is desired as an estimator of some population 
parameter, to choose the one which has the smallest variance among 
those that are unbiased. The resulting estimator is then classed as the 
best linear unbiased estimator. We have indicated above only three 
of the possible subclasses of linear estimators of $T$ when sampling a
finite universe without replacement. The determination of the un- 
biased estimator which has minimum variance within each of these 
subclasses is straightforward. The general solution to the problem of 
determining the best linear unbiased estimator, however, when 
sampling a finite universe without replacement and with arbitrary 
probabilities of selection has not been considered by the authors. We 
observe here, if
(i) there is only one stage of sampling,
(ii) the individual elements of the universe can be identified in advance,
(iii) information on any supplementary variables for use in the estimation
process is lacking, and
(iv) there is no advance knowledge of the values of the characteristic
to be measured,
that a general solution is lacking even in the case of equal probabilities
of selection. In connection with this remark, although it can be easily
shown, when sampling with equal probabilities of selection for each
draw, that the $\alpha$'s, $\beta$'s, and $\gamma$'s are all equal to $N/n$ for the best linear
unbiased estimators of $T$ for each of the three subclasses, this is
certainly not sufficient to claim

$$\hat{T} = \frac{N}{n} \sum_{i=1}^{n} x_i$$

as the “best” among all possible linear unbiased estimators of $T$.

We² will restrict ourselves here to the subclass of linear estimators for
the population total of $X$ given by $\hat{T}_2$. In order that $\hat{T}_2$ be unbiased we
must have

$$E(\hat{T}_2) = T$$

and, hence,

$$\sum_{i=1}^{N} P(u_i) \beta_i X_i = \sum_{i=1}^{N} X_i.$$

In order for this equality to hold whatever be the values of the unknown
$X$'s, we must have

$$P(u_i) \beta_i = 1$$

for all $i$. Therefore,

$$\hat{T} = \sum_{i=1}^{n} \frac{x_i}{P(u_i)} \qquad (6)$$

is the only unbiased linear estimator possible in the subclass under
consideration and hence is “best” for that subclass. Note that if

$$P(u_i) = \frac{n X_i}{T}, \qquad (7)$$

$\hat{T}$ will have zero variance and the sampling will be optimum.
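
A small numerical check may make (6) and (7) concrete. In the Python sketch below, which is an illustration and not the authors' computation, the universe of three elements, both designs, and the helper names `ht_total` and `first_order` are hypothetical. The first design is arbitrary and shows unbiasedness; the second is constructed so that $P(u_i) = nX_i/T$, in which case every possible sample returns the true total.

```python
from fractions import Fraction

def ht_total(sample, X, P1):
    """Estimator (6): sum of x_i / P(u_i) over the sampled elements."""
    return sum(X[i] / P1[i] for i in sample)

def first_order(design, N):
    """Inclusion probabilities P(u_i) implied by a design {sample: probability}."""
    return [sum(pr for s, pr in design.items() if i in s) for i in range(N)]

X = [Fraction(1), Fraction(2), Fraction(3)]   # illustrative values, T = 6
T = sum(X)

# An arbitrary (assumed) design over the three samples of size n = 2.
design = {(0, 1): Fraction(1, 2), (0, 2): Fraction(1, 3), (1, 2): Fraction(1, 6)}
P1 = first_order(design, 3)
expected = sum(pr * ht_total(s, X, P1) for s, pr in design.items())

# A design satisfying (7): P(u_i) = n X_i / T = (1/3, 2/3, 1).
optimal = {(0, 2): Fraction(1, 3), (1, 2): Fraction(2, 3)}
P1_opt = first_order(optimal, 3)
estimates = {ht_total(s, X, P1_opt) for s in optimal}

print(expected == T, estimates == {T})
```

The second design never selects elements 0 and 1 together, so it gives zero variance for the total at the price of a zero joint inclusion probability for that pair.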


² Midzuno uses an estimator belonging to the subclass specified by $\hat{T}_3$.

Using the results obtained in the section on expected values the
variance of $\hat{T}$, say $V(\hat{T})$, follows readily. Thus,

$$V(\hat{T}) = E(\hat{T} - T)^2$$

$$= \sum_{i=1}^{N} \frac{X_i^2}{P(u_i)} + \sum_{i \ne j}^{N} \frac{P(u_i u_j)}{P(u_i) P(u_j)} X_i X_j - T^2 \qquad (8)$$

$$= \sum_{i=1}^{N} \frac{1 - P(u_i)}{P(u_i)} X_i^2 + \sum_{i \ne j}^{N} \frac{P(u_i u_j) - P(u_i) P(u_j)}{P(u_i) P(u_j)} X_i X_j. \qquad (9)$$

This formula applies only when every element has a positive probability
of inclusion in the sample, however (i.e. $P(u_i) > 0$ for all $i$).

An unbiased estimator of the variance of $\hat{T}$ in the general sampling
procedure is also readily obtainable, provided $n$ is greater than one.
Thus,

$$\hat{V}(\hat{T}) = \sum_{i=1}^{n} \frac{1 - P(u_i)}{[P(u_i)]^2} x_i^2 + \sum_{i \ne j}^{n} \frac{P(u_i u_j) - P(u_i) P(u_j)}{P(u_i u_j) P(u_i) P(u_j)} x_i x_j. \qquad (10)$$

Again, this formula is restricted to those sampling schemes which yield
positive probabilities of inclusion for every element and every pair of
elements (i.e. both $P(u_i)$ and $P(u_i u_j)$ greater than zero for all $i$ and $j$).
Alternatively, we may write (10) in the form

$$\hat{V}(\hat{T}) = \hat{T}^2 - \sum_{i=1}^{n} \frac{x_i^2}{P(u_i)} - \sum_{i \ne j}^{n} \frac{x_i x_j}{P(u_i u_j)}. \qquad (11)$$

If an unbiased estimate of the population mean is desired, it is sufficient
to divide the unbiased estimator of the population total, (6), by
$N$. The sampling variance of this estimator is the same as (8) except
for an additional factor of $1/N^2$.
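
The variance formula and its estimator can be verified by enumeration on a toy universe. The Python sketch below assumes a hypothetical design over the six samples of size two from four elements, and codes the variance as the sum over elements of $(1 - P_i)X_i^2/P_i$ plus the sum over ordered pairs of $(P_{ij} - P_iP_j)X_iX_j/(P_iP_j)$, with the estimator and its alternative form read in the same way; the function names are illustrative and this reading of formulas (9), (10), and (11) is an assumption of the example.

```python
from fractions import Fraction
from itertools import combinations

N, n = 4, 2
X = [Fraction(v) for v in (3, 5, 8, 13)]        # illustrative values X_i
T = sum(X)
samples = list(combinations(range(N), n))
design = {s: Fraction(w, 21) for s, w in zip(samples, (1, 2, 3, 4, 5, 6))}  # assumed

P1 = [sum(pr for s, pr in design.items() if i in s) for i in range(N)]
P2 = {(i, j): sum(pr for s, pr in design.items() if i in s and j in s)
      for i in range(N) for j in range(N) if i != j}

def ht(s):
    """Estimator of the total: sum of x_i / P(u_i) over the sample."""
    return sum(X[i] / P1[i] for i in s)

def var_formula():
    """Variance of the estimator computed from inclusion probabilities alone."""
    v = sum((1 - P1[i]) / P1[i] * X[i] ** 2 for i in range(N))
    v += sum((P2[i, j] - P1[i] * P1[j]) / (P1[i] * P1[j]) * X[i] * X[j]
             for (i, j) in P2)
    return v

def var_estimate(s):
    """Variance estimate computed from a single sample s."""
    v = sum((1 - P1[i]) / P1[i] ** 2 * X[i] ** 2 for i in s)
    v += sum((P2[i, j] - P1[i] * P1[j]) / (P2[i, j] * P1[i] * P1[j]) * X[i] * X[j]
             for i in s for j in s if i != j)
    return v

def var_estimate_alt(s):
    """Alternative form: squared estimate minus two correction sums."""
    return (ht(s) ** 2 - sum(X[i] ** 2 / P1[i] for i in s)
            - sum(X[i] * X[j] / P2[i, j] for i in s for j in s if i != j))

true_var = sum(pr * (ht(s) - T) ** 2 for s, pr in design.items())
mean_est = sum(pr * var_estimate(s) for s, pr in design.items())
print(true_var == var_formula(), mean_est == true_var)
```

Individual variance estimates from this estimator can be negative for some samples even though their average over the design equals the true variance, which is worth keeping in mind when it is applied to a single sample.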


APPLICATION TO KNOWN SAMPLING DESIGNS 


The general nature of this approach to sampling a finite universe 
without replacement will be illustrated by considering the estimator 
and its sampling variance, as derived in the preceding section, for 
simple random, systematic, and stratified random sampling procedures. 

With simple random sampling or equal probabilities of selection for
the elements remaining prior to each draw, we have

$$P(u_i) = \frac{n}{N} \qquad (i = 1, 2, \ldots, N)$$

$$P(u_i u_j) = \frac{n(n-1)}{N(N-1)} \qquad (i, j = 1, 2, \ldots, N;\ i \ne j).$$

Substitution of these inclusion probabilities in formulas (6) and (8)
yields the estimator

$$\hat{T} = \frac{N}{n} \sum_{i=1}^{n} x_i$$

with sampling variance

$$V(\hat{T}) = \frac{N(N-n)}{n} \cdot \frac{\sum_{i=1}^{N} (X_i - T/N)^2}{N-1}.$$

In addition, the estimator for the variance of $\hat{T}$ in the general case,
formula (10), reduces to

$$\hat{V}(\hat{T}) = \frac{N(N-n)}{n} \cdot \frac{\sum_{i=1}^{n} (x_i - \hat{T}/N)^2}{n-1}.$$


These derived expressions agree with the formulas usually pre- 
scribed for the respective quantities when the sample is selected at 
random without replacement. 
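
The agreement with the usual simple random sampling formulas can also be confirmed by enumerating every equally likely sample. The Python sketch below is illustrative only; the five population values and the function names are assumptions. It uses the familiar forms $N(N-n)S^2/n$ for the variance of the expanded total and $N(N-n)s^2/n$ for its sample estimate.

```python
from fractions import Fraction
from itertools import combinations

N, n = 5, 3
X = [Fraction(v) for v in (2, 4, 4, 7, 11)]     # illustrative population values
T = sum(X)
samples = list(combinations(range(N), n))       # each sample equally likely

def t_hat(s):
    """Expanded sample total (N/n) * sum of x_i."""
    return Fraction(N, n) * sum(X[i] for i in s)

def v_hat(s):
    """Usual variance estimate N(N-n)/n * s^2 from one sample."""
    mean = sum(X[i] for i in s) / n
    s2 = sum((X[i] - mean) ** 2 for i in s) / (n - 1)
    return Fraction(N * (N - n), n) * s2

S2 = sum((x - T / N) ** 2 for x in X) / (N - 1)  # population mean square
usual_var = Fraction(N * (N - n), n) * S2
enumerated_var = sum((t_hat(s) - T) ** 2 for s in samples) / len(samples)
mean_v_hat = sum(v_hat(s) for s in samples) / len(samples)

print(usual_var == enumerated_var, mean_v_hat == usual_var)
```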

To illustrate the application of the general results to systematic
samples, we consider the simplified case of a universe of $N = kn$ elements.
A systematic sample is obtained by selecting every $k$th element
following the choice of a random starting point among the elements
numbered 1 through $k$. The measured value of the characteristic
of interest associated with the $j$th element in the $i$th possible sample is
denoted by $X_{ij}$.

It follows readily that

$$P(u_{ij}) = \frac{1}{k}$$

for all $i$ and $j$,

$$P(u_{ij} u_{i'j'}) = \frac{1}{k}$$

for $i = i'$, and

$$P(u_{ij} u_{i'j'}) = 0$$

for all other pairs of elements. Formula (6) again yields the usual
estimator

$$\hat{T}_i = \frac{N}{n} \sum_{j=1}^{n} X_{ij}$$

for the population total, the subscript $i$ denoting the sample chosen.
The sampling variance of $\hat{T}$, as given by (8), namely

$$V(\hat{T}) = k \sum_{i=1}^{k} \sum_{j=1}^{n} X_{ij}^2 + k \sum_{i=1}^{k} \sum_{j \ne j'}^{n} X_{ij} X_{ij'} - T^2,$$
fu 


is also equivalent to the usual formula. This is most easily seen by
expanding the particular form

$$V(\hat{T}) = \frac{N^2}{k} \sum_{i=1}^{k} (\bar{x}_i - \bar{X})^2,$$

where

$$\bar{x}_i = \frac{1}{n} \sum_{j=1}^{n} X_{ij}, \qquad \bar{X} = \frac{T}{N},$$

of the usual variance formula for systematic samples. (See, for example,
L. H. Madow [4]). Since certain pairs of elements have no chance of
being included together in a sample with this systematic design,
formula (10) cannot be used to estimate the sampling variance of $\hat{T}$
from the sample data.

The variance formula (8), left in its expanded form, provides an inter- 
esting method for examining the conditions under which one sampling 
system will be more efficient than another disregarding costs. To 
examine the efficiency of a systematic sample versus a random sample 
we note that the respective variance formulas differ only in the middle 
term of (8). Thus a systematic sample will be more efficient (the particular
estimator chosen will have a smaller variance for systematic
samples than for random samples) if

$$\sum_{i=1}^{k} \sum_{j \ne j'}^{n} X_{ij} X_{ij'} < \frac{n-1}{N-1} \sum_{(i,j) \ne (i',j')} X_{ij} X_{i'j'}.$$

Following some algebraic manipulation, this condition reduces to

$$\frac{n}{k-1} \sum_{i=1}^{k} (\bar{x}_i - \bar{X})^2 < \frac{1}{k} \sum_{i=1}^{k} s_i^2, \qquad (12)$$

where $\bar{x}_i$ is the mean of the $i$th sample as defined above, $\bar{X} = T/N$, and

$$s_i^2 = \frac{\sum_{j=1}^{n} (X_{ij} - \bar{x}_i)^2}{n-1}.$$

Essentially, then, a systematic sample will be more efficient than a
random sample if the variation between the possible systematic 
samples is less than the average variation within the possible samples. 
It should be noted in particular that (12) is exactly equivalent to the 
well-known condition of an intraclass (intrasample, in this case) cor- 
relation coefficient less than —1/(N—1) for an efficient systematic 
sample relative to a random sample. Further examination of (12) leads 
rapidly to several of the other known conditions for gains with a sys- 
tematic sample. 
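
The comparison between systematic and random sampling can be tried on a small ordered population. The Python sketch below is an assumption-laden illustration: the population is a pure linear trend of twelve values, and "variation between" and "variation within" are coded as the between-sample and average within-sample mean squares, which is one reading of the condition described above and not necessarily the paper's exact expression.

```python
from fractions import Fraction

k, n = 3, 4
N = k * n
X = [Fraction(v) for v in range(1, N + 1)]   # assumed population with a linear trend
T = sum(X)

# The k possible systematic samples: random start r, then every kth element.
sys_samples = [list(range(r, N, k)) for r in range(k)]
sys_totals = [k * sum(X[i] for i in s) for s in sys_samples]   # (N/n) * sample sum
var_sys = sum((t - T) ** 2 for t in sys_totals) / k

# Simple random sampling of the same size, usual formula.
S2 = sum((x - T / N) ** 2 for x in X) / (N - 1)
var_srs = Fraction(N * (N - n), n) * S2

# Between-sample and average within-sample mean squares.
means = [sum(X[i] for i in s) / n for s in sys_samples]
between = n * sum((m - T / N) ** 2 for m in means) / (k - 1)
within = sum(sum((X[i] - m) ** 2 for i in s) / (n - 1)
             for s, m in zip(sys_samples, means)) / k

print(var_sys, var_srs, between < within)
```

With a linear trend each systematic sample spans the whole range, so the samples resemble one another and the systematic design has the smaller variance. A population that is periodic with period $k$ would reverse the comparison.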

When the universe elements have been classified into K strata and a 
random sample selected from each stratum, substitution of the inclusion
probabilities for individual elements and pairs of elements in (6)
and (8) again yields the usual formulas for the appropriate linear unbiased
estimator and its sampling variance. If $u_{ij}$ denotes the $j$th element
in the $i$th stratum, $N_i$ the number of elements in the $i$th stratum,
and $n_i$ the number of elements selected for the sample from that
stratum, the inclusion probabilities are

$$P(u_{ij}) = \frac{n_i}{N_i}$$

for all $j$,

$$P(u_{ij} u_{i'j'}) = \frac{n_i (n_i - 1)}{N_i (N_i - 1)}$$

for $i = i'$, $j \ne j'$, and

$$P(u_{ij} u_{i'j'}) = \frac{n_i n_{i'}}{N_i N_{i'}}$$

for $i \ne i'$, for all $j$ and $j'$.

Whereas the above results point out that for the schemes considered 
the possible estimators $\hat{T}_1$ and $\hat{T}_2$ are equivalent, this will not be true
in general. It should be noted that for each of these schemes the prob- 
ability of including a particular element in a sample is the same either 
for all the elements of the universe or for all the elements of the same 
sub-universe. 
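
For stratified random sampling the reduction to the usual variance formula can be checked numerically. The Python sketch below is an illustration under assumed data: two hypothetical strata, two elements drawn from each, and the general variance written as the sum over elements of $(1-P)X^2/P$ plus the sum over ordered pairs of $(P_{ab} - P_aP_b)X_aX_b/(P_aP_b)$. Pairs from different strata are selected independently, so their terms vanish.

```python
from fractions import Fraction

# Hypothetical strata (values of X) and stratum sample sizes n_i.
strata = [[Fraction(v) for v in vals] for vals in ((2, 4, 9), (10, 14, 15, 21))]
n_h = [2, 2]

# Flatten the universe, remembering each element's stratum.
units = [(h, x) for h, vals in enumerate(strata) for x in vals]
M = len(units)

def P1(a):
    """P(u_ij) = n_i / N_i."""
    h = units[a][0]
    return Fraction(n_h[h], len(strata[h]))

def P2(a, b):
    """Joint inclusion probability for two distinct elements."""
    ha, hb = units[a][0], units[b][0]
    if ha == hb:
        Nh, nh = len(strata[ha]), n_h[ha]
        return Fraction(nh * (nh - 1), Nh * (Nh - 1))
    return P1(a) * P1(b)          # independent selection across strata

var_general = sum((1 - P1(a)) / P1(a) * units[a][1] ** 2 for a in range(M))
var_general += sum((P2(a, b) - P1(a) * P1(b)) / (P1(a) * P1(b))
                   * units[a][1] * units[b][1]
                   for a in range(M) for b in range(M) if a != b)

def mean_square(vals):
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / (len(vals) - 1)

# Usual stratified formula: sum over strata of N_i (N_i - n_i) S_i^2 / n_i.
var_usual = sum(Fraction(len(v) * (len(v) - nh), nh) * mean_square(v)
                for v, nh in zip(strata, n_h))

print(var_general == var_usual)
```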


EXTENSION TO A TWO-STAGE SAMPLING DESIGN 


The extension of the use of arbitrary probabilities of selection for 
each draw to designs involving more than one stage has been examined 
for a special case only. The universe now consists of $K$ primary sampling
units with the $i$th such unit containing $N_i$ secondary or subsampling
units. Let $X_{ij}$ be the value of some characteristic $X$ of the $j$th subsampling
unit of the $i$th primary sampling unit. The population total


$$T = \sum_{i=1}^{K} \sum_{j=1}^{N_i} X_{ij} \qquad (13)$$


is to be estimated from a sample of k primary units, $n_i$ subsampling
units to be drawn from the ith primary unit if it is in the sample. The
primary units are drawn without replacement using arbitrary
probabilities of selection for each draw. An over-all sampling rate, say t,
is specified in advance and the $n_i$ determined from the relation

(14)    $n_i = \frac{tN_i}{P(u_i)}$

where $P(u_i)$ now denotes the a priori probability that the ith primary
unit will be included in a sample of k such units. The subsampling
units are to be drawn without replacement with equal probabilities of
selection for those remaining prior to each draw, i.e. at random. This
sampling procedure is entirely analogous to that specified by Hansen
and Hurwitz [2] (ignoring area substratification) when a single primary
unit is drawn with probability proportionate to its estimated size.

One difficulty that arises with this scheme concerns the $n_i$ as
determined by (14). In most practical applications this relation will not
yield integral subsampling sizes. We will neglect the bias introduced by
choosing the closest integral value for $n_i$ in what follows. It should
also be noted that the $N_i$ need not be known in advance of the primary
unit selection stage of the draw.

Since every subsampling unit will have the same chance of being
included in the sample, it follows that

(15)    $\hat{T} = \frac{1}{t} \sum_{i=1}^{k} \sum_{j=1}^{n_i} X_{ij}$

will provide an unbiased estimate of T. The variance of this estimator
(provided $P(u_i) > 0$) follows readily from the previous results. Thus

(16)    $V(\hat{T}) = \sum_{i=1}^{K} \frac{1 - P(u_i)}{P(u_i)}\, T_i^2 + \sum_{i \neq j}^{K} \frac{P(u_iu_j) - P(u_i)P(u_j)}{P(u_i)P(u_j)}\, T_i T_j + \frac{1}{t} \sum_{i=1}^{K} (N_i - n_i)\, \sigma_i^2,$

where

$T_i = \sum_{j=1}^{N_i} X_{ij} = N_i \mu_i$

and

$\sigma_i^2 = \frac{\sum_{j=1}^{N_i} (X_{ij} - \mu_i)^2}{N_i - 1}.$

The first two terms of the right member of (16) make up the usual
between primary unit component of variance, the last term being the
within component.
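As an illustration of relations (14) and (15), the following minimal sketch uses a hypothetical universe of three primary units. The sizes, inclusion probabilities, sampling rate, and sample values are all assumptions made for the example, not data from the paper. It shows that (14) need not give integral subsample sizes, and that (15) agrees with the form that weights each sampled primary unit by $N_i/P(u_i)$.

```python
import math

def subsample_sizes(N, P, t):
    """Relation (14): n_i = t * N_i / P(u_i); the result is generally non-integral."""
    return [t * N_i / P_i for N_i, P_i in zip(N, P)]

def estimate_total(samples, t):
    """Estimator (15): the sum of all sampled values divided by the over-all rate t."""
    return sum(sum(values) for values in samples) / t

def estimate_total_weighted(samples, N, P):
    """Equivalent form: sum over sampled primary units of N_i * (sample mean) / P(u_i)."""
    return sum(N_i * (sum(v) / len(v)) / P_i for v, N_i, P_i in zip(samples, N, P))

# Hypothetical universe of K = 3 primary units; the P(u_i) sum to k = 2.
N = [40, 60, 100]
P = [0.5, 0.75, 0.75]
t = 0.1
n = subsample_sizes(N, P, t)  # the third value is not an integer

# Suppose the first two primary units are drawn and 8 subsampling units are
# observed in each (hypothetical values).
samples = [list(range(1, 9)), list(range(11, 19))]
direct = estimate_total(samples, t)
weighted = estimate_total_weighted(samples, N[:2], P[:2])
```

The two forms agree only because the subsample sizes satisfy (14) exactly; once the $n_i$ are rounded to integers the agreement is approximate, which is the bias the text proposes to neglect.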
An estimate of this sampling variance may be computed from the
elements in the sample, the estimator provided here having the
property of unbiasedness. Thus (when the $P(u_i)$ and $P(u_iu_j)$ are all
greater than zero)


(17)    V(T) = T² − …    [the remainder of this equation and the definitions following "where" are illegible in the source]

An unbiased estimate of the between component of variance may be
obtained by using V(T) in conjunction with an unbiased estimate of
the within component, the latter being given by the quantity


3 P(ui) 


or 


SOME ASPECTS OF THE PRACTICAL APPLICATION OF THE THEORY 


The remaining sections of this paper will be devoted to examining 
some aspects of the problems arising in connection with attempts to 
utilize the preceding theory in practical applications. To simplify the 
exposition, attention will be confined to a one stage sampling scheme; 
however, various extensions of the results to multi-stage stratified 
designs are evident. 

In the estimating functions (6) and (15) as well as the corresponding
expressions for the variance and sample estimates of the variance, it
will be noticed that it is the quantities $P(u_i)$ and $P(u_iu_j)$ that can be
controlled by the sampler. Assume that a variable Y reasonably
correlated with X is known for each element of the universe, and that the
sampler wishes to utilize the information in Y in assigning the
selection probabilities (1), so that the resulting $P(u_i)$ and $P(u_iu_j)$ will lead
to a reduction in variance. The three main problems that arise in this
connection consist of
(i) determining the quantities $P(u_iu_j)$ that will minimize the
variance (9) (specifying the $P(u_iu_j)$ determines the $P(u_i)$ since
$\sum_{j \neq i} P(u_iu_j) = (n-1)P(u_i)$, as may be easily verified),
(ii) defining the sets (1) to achieve the $P(u_iu_j)$ thus determined, and
(iii) investigating the conditions required on the relationship
between Y and X to obtain gains in efficiency over sampling
systems employing the information in Y in alternative ways.
These three problems are not independent and a general solution has 
not been reached by the authors. Some progress has been made in par- 
ticular cases, however, and it is hoped that a discussion of these cases 
will provoke interest and stimulate others to investigate sampling sys- 
tems of the type here considered. 
Considering first the problem of assigning the $P(u_iu_j)$, as a first
approximation to an “optimum” assignment, we may require only that

(18)    $\frac{1}{n-1} \sum_{j \neq i}^{N} P(u_iu_j) = P(u_i) = \frac{nY_i}{\sum_{i=1}^{N} Y_i}.$

If the $X_i$ are approximately proportional to the $Y_i$, this assignment may
be expected to lead to an estimator with small variance, as may be seen
by assuming strict proportionality between X and Y and noting that
(6) is then identically T. From examination of (9), it appears that the
assignment of the $P(u_iu_j)$ (in terms of the $Y_i$) that leads to minimum
variance depends upon the joint distribution of X and Y. This
complicates the problem. In the examples discussed in a later section, however,
it will be demonstrated that a substantial reduction in variance can be
achieved by an assignment of the type indicated in (18).
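The remark about strict proportionality can be checked numerically. The sketch below assumes that estimator (6) has the form $\sum x_i / P(u_i)$ taken over the sample (the form that appears for Schemes 1 and 2 in Table 3), uses the measures of size from Table 1, and takes a hypothetical constant of proportionality c; under these assumptions every possible sample of size 2 returns exactly the population total.

```python
from itertools import combinations
from math import isclose

Y = [32, 23, 17, 13, 10, 5]        # measures of size, as in Table 1
n = 2                              # sample size
total_Y = sum(Y)
P = [n * y / total_Y for y in Y]   # assignment (18): P(u_i) = n * Y_i / sum(Y)

c = 1.5                            # hypothetical constant: X_i = c * Y_i exactly
X = [c * y for y in Y]
T = sum(X)                         # population total to be estimated

# Assumed form of estimator (6): the sum of x_i / P(u_i) over the sample.
estimates = [sum(X[i] / P[i] for i in s)
             for s in combinations(range(len(Y)), n)]
```

Each term $x_i/P(u_i)$ equals $c \sum Y_i / n$, so the n terms always add to $c \sum Y_i = T$ and the variance is zero; with only approximate proportionality the estimates scatter around T instead.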

The problem of determining the sets of selection probabilities (1)
that will yield preassigned “optimum” values of $P(u_iu_j)$, or $P(u_i)$
as in (18) above, can be illustrated by a simple example. Suppose a
sample of size 2 is to be drawn without replacement from the universe
of 6 elements given in Table 1 in such a manner that
$P(u_i) = 2Y_i / \sum_{i=1}^{6} Y_i$. In the notation of the previous sections,
n sets $\{p_{i_m}\}$, (m = 1, 2, ..., n), each satisfying

$0 \le p_{i_m} \le 1, \qquad \sum_{i=1}^{N-m+1} p_{i_m} = 1,$

must be defined, so that when the n draws have been performed
according to these sets, the probability that $u_i$ will be included in the
sample will be the assigned probability $P(u_i)$. The notation adopted
for the sets of selection probabilities is not entirely satisfactory since it
does not indicate the dependence of the set used for selecting the mth
element in the sample on the results of the previous m-1 draws. In
Table 1, columns 5 and 6, this dependence is explicitly indicated by
using the notation commonly employed for a conditional probability.
Thus,

$\{p_{i_2} \mid u_j \in s\}$

is the probability assigned to the selection of $u_i$ on the second draw,
given that $u_j$ has been obtained on the first draw.


TABLE 1

(1)     (2)     (3)       (4)          (5)                   (6)
u_i     Y_i     P(u_i)    {p_{i_1}}    {p_{i_2} | u_1 ∈ s}   {p_{i_2} | u_2 ∈ s}

1       32      .64       …            …                     …
2       23      .46       …            …                     …
3       17      .34       0            …                     …
4       13      .26       0            …                     …
5       10      .20       0            …                     …
6        5      .10       0            …                     …

Total   100     2.00      1.0          1.0                   1.0


It may be easily verified that the selection probabilities defined in
columns 4, 5, and 6 of Table 1 do achieve the inclusion probabilities in
column 3. The solution indicated is one of an infinity of solutions and
was chosen primarily for its simplicity. The authors are not aware of
general methods for examining the consistency of systems of equations
of the type used in obtaining columns 4, 5, and 6 or of finding
simultaneous positive solutions when they exist. For the solution given
$P(u_iu_j) = 0$ for $i \neq j = 3, 4, 5, 6$, and formula (10) is therefore not
applicable.
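The kind of verification described here can be sketched in a few lines. The first-draw and conditional second-draw probabilities below are illustrative values chosen for this example so that they reproduce column 3 of Table 1; they are an assumption and should not be read as the entries of columns 4 to 6. The function names are likewise hypothetical.

```python
from math import isclose

# Illustrative (assumed) selection probabilities for a sample of size 2.
first = [0.6, 0.4, 0.0, 0.0, 0.0, 0.0]        # first draw
second_given = {                              # second draw, given the first element
    0: [0.0, 0.1, 0.3, 0.3, 0.2, 0.1],
    1: [0.1, 0.0, 0.4, 0.2, 0.2, 0.1],
}

def pair_probabilities(first, second_given):
    """P(u_i u_j): total probability of each unordered pair over both draw orders."""
    pairs = {}
    for a, p_a in enumerate(first):
        if p_a == 0.0:
            continue
        for b, p_b in enumerate(second_given[a]):
            if p_b == 0.0:
                continue
            key = tuple(sorted((a, b)))
            pairs[key] = pairs.get(key, 0.0) + p_a * p_b
    return pairs

def inclusion_probabilities(pairs, N):
    """P(u_i): sum of the probabilities of all samples containing element i."""
    return [sum(p for key, p in pairs.items() if i in key) for i in range(N)]

pairs = pair_probabilities(first, second_given)
incl = inclusion_probabilities(pairs, len(first))
target = [0.64, 0.46, 0.34, 0.26, 0.20, 0.10]  # column 3 of Table 1
```

With these values no pair drawn from elements 3 to 6 can occur, so the corresponding $P(u_iu_j)$ are zero, which is the situation the text notes makes formula (10) inapplicable.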

It should be noted that the $P(u_i)$ are probabilities satisfying

$0 < P(u_i) \le 1, \qquad \sum_{i=1}^{N} P(u_i) = n,$

so that if the “measures of size” are such that for any element, say $u_t$,
the quantity $nY_t / \sum_{i=1}^{N} Y_i$ is greater than unity, no method of
drawing the sample exists which will give $P(u_t) = nY_t / \sum_{i=1}^{N} Y_i$. This situation
can be obviated in various ways (stratification, subdivision of the
elements of the universe, etc.) so that it need cause no difficulty.

A secondary consideration of practical importance in defining the sets
$\{p_{i_m}\}$ by the general method indicated above is that particular
solutions may not facilitate the computation of the quantities $P(u_iu_j)$
required for estimating the variance (10). To calculate this expression the
quantities $P(u_iu_j)$ must be determined for the $\binom{n}{2}$ combinations of the
sample elements. For n of any considerable size the direct
calculation of the $P(u_iu_j)$ by summing the probabilities associated with the
samples containing $u_i$ and $u_j$ is impractical. It would thus seem
advisable to restrict further the choice of selection schemes to those
schemes that permit ready calculation of $P(u_iu_j)$.

A selection scheme that obviates the problem of explicitly defining
the set (1) yet satisfies (18) is mentioned by Goodman and Kish [1].
The N universe elements are listed in a random order and their
measures of size are cumulated. A systematic selection of n elements
from a random start is then made on the cumulation so that
$P(u_i) = nY_i / \sum_{i=1}^{N} Y_i$. This selection is easily performed, but there
does not appear to be any simple way to determine the $P(u_iu_j)$.
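The cumulate-and-select procedure just described can be sketched as follows. This is a simplified illustration under stated assumptions: the measures of size are integers whose total is divisible by n, the random start is an integer between 1 and the selection interval, and the listing order is held fixed rather than randomized so that the result is reproducible. The function names are hypothetical.

```python
from itertools import accumulate

def systematic_pps(sizes, n, start):
    """Select n elements at points start, start + k, ... on the cumulated sizes."""
    interval = sum(sizes) // n            # k; assumes the total is divisible by n
    bounds = list(accumulate(sizes))      # cumulated measures of size
    selected = []
    for m in range(n):
        point = start + m * interval
        # Element i covers the cumulated values bounds[i-1] < point <= bounds[i].
        selected.append(next(i for i, b in enumerate(bounds) if point <= b))
    return selected

def inclusion_counts(sizes, n):
    """Count, over every possible integer start, how often each element is chosen."""
    interval = sum(sizes) // n
    counts = [0] * len(sizes)
    for start in range(1, interval + 1):
        for i in set(systematic_pps(sizes, n, start)):
            counts[i] += 1
    return counts

sizes = [32, 23, 17, 13, 10, 5]           # measures of size from Table 1
counts = inclusion_counts(sizes, 2)
```

Dividing each count by the interval (here 50) gives the inclusion probability $nY_i/\sum Y_i$, provided no single measure of size exceeds the interval. The pairwise probabilities depend on the listing order, which is why the text finds no simple expression for them.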


Sampling Scheme 1 


A method of defining the set $\{p_{i_m}\}$ that yields an exact solution
under certain conditions can be developed in the following way.
Consider drawing a sample of n elements from a universe of N
elements without replacement, where the first element is selected
according to the set $p_{i_1}$, (i = 1, 2, ..., N), $\sum_{i=1}^{N} p_{i_1} = 1$, $p_{i_1} > 0$. At the
second and all remaining (n-1) stages of the draw equal probabilities
are assigned to the elements remaining, i.e. the set $p_{i_2}$ consists of N-1
equal elements 1/(N-1), the set $p_{i_3}$, N-2 equal elements 1/(N-2),
etc.³ By simple combinatorial analysis we find that


³ Midzuno suggested using this scheme for drawing the sample in connection with his sampling
system on a recent visit to the Statistical Laboratory, Iowa State College. It may be mentioned that
with this method of drawing, each of the possible $\binom{N}{n}$ different samples has a probability of being
selected proportional to the total of the measures of size for the elements in the combination, which is
desirable in his system.
$P(u_i) = p_{i_1} + (1 - p_{i_1})\, \frac{(N-2)!}{(n-2)!\,(N-n)!} \Big/ \frac{(N-1)!}{(n-1)!\,(N-n)!},$

which upon simplification becomes

(19)    $P(u_i) = \frac{N-n}{N-1}\, p_{i_1} + \frac{n-1}{N-1}, \qquad (i = 1, 2, \ldots, N).$

Similarly, it may be shown that for this case

(20)    $P(u_iu_j) = \frac{n-1}{N-1} \left[ \frac{N-n}{N-2}\, (p_{i_1} + p_{j_1}) + \frac{n-2}{N-2} \right], \qquad (i \neq j;\ i, j = 1, 2, \ldots, N).$

Solving (19) for $p_{i_1}$ in terms of the $P(u_i)$,

(21)    $p_{i_1} = \frac{N-1}{N-n}\, P(u_i) - \frac{n-1}{N-n}, \qquad (i = 1, 2, \ldots, N),$

it should be noted that the $p_{i_1}$ are subject to the two conditions,
(i) $p_{i_1} \ge 0$, for all i,
(ii) $\sum_{i=1}^{N} p_{i_1} = 1$.

In a particular case when the $P(u_i)$ have been assigned such that one or
more of the inequalities $P(u_i) < (n-1)/(N-1)$ are satisfied, the
corresponding solutions of (21) will be negative. This restriction is rather
severe, in general, and limits the usefulness of this method. For a small
sample size, however, the method may be satisfactory, as will be
demonstrated in the example that follows, and approximate solutions based
upon it for larger sample sizes can be obtained easily. For the case
when all solutions of (21) are positive with $P(u_i) = nY_i / \sum_{i=1}^{N} Y_i$, the
first element would be drawn according to the set of solutions of (21)
and equal probabilities would be used for the remaining draws.
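Formulas (19) to (21) can be checked by brute force for a small universe. The sketch below enumerates every ordered draw under Sampling Scheme 1 and compares the resulting inclusion probabilities with the closed forms; the first-draw probabilities and the values N = 5, n = 3 are hypothetical choices made for the check.

```python
from itertools import permutations
from math import isclose

def scheme1_by_enumeration(p, n):
    """Enumerate ordered draws: first element by p, the rest with equal probability."""
    N = len(p)
    incl = [0.0] * N
    pair = [[0.0] * N for _ in range(N)]
    for draw in permutations(range(N), n):
        prob = p[draw[0]]
        for m in range(1, n):
            prob *= 1.0 / (N - m)         # equal chances among the remaining elements
        for i in draw:
            incl[i] += prob
            for j in draw:
                if i != j:
                    pair[i][j] += prob
    return incl, pair

def formula_19(p_i, N, n):
    return (N - n) / (N - 1) * p_i + (n - 1) / (N - 1)

def formula_20(p_i, p_j, N, n):
    return (n - 1) / (N - 1) * ((N - n) / (N - 2) * (p_i + p_j) + (n - 2) / (N - 2))

def formula_21(P_i, N, n):
    return (N - 1) / (N - n) * P_i - (n - 1) / (N - n)

p = [0.40, 0.30, 0.15, 0.10, 0.05]        # hypothetical first-draw probabilities
N, n = len(p), 3
incl, pair = scheme1_by_enumeration(p, n)
```

Because (21) is simply (19) solved for the first-draw probability, applying `formula_21` to the enumerated inclusion probabilities recovers `p`; a desired $P(u_i)$ below $(n-1)/(N-1)$ would return a negative value, which is the restriction noted in the text.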


Sampling Scheme 2 


There are undoubtedly many other ways of defining sets of selection
probabilities such that condition (18) will be satisfied approximately.
To be practical, the necessary computations should remain simple,
however. At the same time, although such schemes yield only an
approximate solution to the preassigned “optimum” values of the $P(u_i)$,
they should include exact expressions for the actual values of these
quantities and for the $P(u_iu_j)$ as well. The scheme proposed here
satisfies these requirements but is restricted to samples of size 2. It may
lead to a better approximation to the desired $P(u_i)$ when the
condition for an exact solution with Sampling Scheme 1 is not satisfied.

The particular scheme suggested here requires the prior
determination only of the set $p_{i_1}$, (i = 1, 2, ..., N); that is, the set of selection
probabilities to be used on the first draw. The set to be used for the
second draw depends on the first element selected. Thus, if element $u_j$
is selected on the first draw, then


$p_{i_2} = \frac{p_{i_1}}{1 - p_{j_1}}, \qquad \text{for } i \neq j,$

$p_{i_2} = 0, \qquad \text{for } i = j,$

defines the set of selection probabilities $p_{i_2}$, (i = 1, 2, ..., N). In
practice the second element may be selected after adjusting the selection
probabilities used on the first draw or the same set may be used
throughout, the selection process continuing until two different
elements have been chosen. Since only the set of selection probabilities
for the first draw needs to be determined in advance, the subscript
indicating the draw will be dropped.

If a sample of size 2 is to be drawn using one set of selection
probabilities and with replacement, then the probability that element $u_i$ will
be selected only once in the two draws is $2p_i(1 - p_i)$. If the conditions
are such that sampling without replacement is not much different than
sampling with replacement, this probability will be approximately
equal to $P(u_i)$, the inclusion probability for sampling without
replacement. This suggests that a set of selection probabilities which will
lead to an approximation of the desired $P(u_i)$ with the prescribed
sampling procedure may be determined from the solution to the system
of equations

(22)    $2p_i^2 - 2p_i + \frac{2Y_i}{\sum_{t=1}^{N} Y_t} = 0, \qquad (i = 1, 2, \ldots, N)$

where, of course, the common coefficient 2 may be cancelled. Again, the
solution must be such that

$p_i \ge 0, \qquad \text{for all } i,$

and

$\sum_{i=1}^{N} p_i = 1.$


In practice, a satisfactory solution to this system may be obtained by
solving each of the N quadratic equations separately, taking the smaller
of the two roots. Since the selection probabilities must sum to unity,
a simple adjustment is then made by dividing each of the $p_i$ obtained
in this manner by their sum. This method was used in the example
which follows where it proved quite adequate. It should be noted that
this procedure for solving the system (22) breaks down if any of the
desired $P(u_i) > \tfrac{1}{2}$, since the solutions for the particular quadratic
equations will then be imaginary.

The accuracy of the whole method depends on the original assumption
that a formula based on sampling with replacement will be adequate
even though the sampling is without replacement. Although no
detailed investigation has been made on this point, it appears that for N
at least as large as 10 and the desired $P(u_i)$ not dominated entirely by
one or two elements reasonable success will result.

One additional point is necessary. Whatever the set of selection
probabilities adopted with this sampling procedure, the exact formulas
for the $P(u_i)$ and $P(u_iu_j)$ are

(23)    $P(u_i) = p_i \left[ 1 + \sum_{j \neq i} \frac{p_j}{1 - p_j} \right],$

(24)    $P(u_iu_j) = p_i p_j \left[ \frac{1}{1 - p_i} + \frac{1}{1 - p_j} \right].$
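The exact expressions (23) and (24) can be confirmed by enumerating the two draws of Sampling Scheme 2 directly. The selection probabilities in this sketch are hypothetical, and the function names are introduced only for the illustration.

```python
from math import isclose

def scheme2_by_enumeration(p):
    """First draw with p; second draw with p_i / (1 - p_j) among the other elements."""
    N = len(p)
    incl = [0.0] * N
    pair = [[0.0] * N for _ in range(N)]
    for j in range(N):                    # element taken on the first draw
        for i in range(N):                # element taken on the second draw
            if i == j:
                continue
            prob = p[j] * p[i] / (1.0 - p[j])
            incl[i] += prob
            incl[j] += prob
            pair[i][j] += prob
            pair[j][i] += prob
    return incl, pair

def formula_23(p, i):
    return p[i] * (1.0 + sum(p[j] / (1.0 - p[j]) for j in range(len(p)) if j != i))

def formula_24(p, i, j):
    return p[i] * p[j] * (1.0 / (1.0 - p[i]) + 1.0 / (1.0 - p[j]))

p = [0.4, 0.3, 0.2, 0.1]                  # hypothetical selection probabilities
incl, pair = scheme2_by_enumeration(p)
```

For a sample of size 2 the pairwise probabilities involving a given element add up to that element's inclusion probability, and the inclusion probabilities add up to 2; both identities serve as quick checks on a hand computation.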


Example: 

The universe to be investigated consists of 20 blocks in Ames, Iowa,
the data being given in columns 1 to 3 in Table 2. These data are taken
from a survey conducted by the Statistical Laboratory of Iowa State
College. The estimated number of households (column (3)) was
obtained by a team of observers who drove through the portion of the
city of Ames represented by these 20 blocks and made rapid
eye-estimates of the number of households on each block.

We shall consider drawing a sample of 2 blocks with probability
proportionate to this measure of size (eye-estimated households)
according to the two selection schemes previously developed. The exact
values which the selection schemes are designed to achieve are listed in
column (4) of Table 2, and the corresponding results of the two
proposed selection schemes are shown in columns (6) and (8). Columns (5)
and (7) give the selection probabilities for the first draw with scheme 1
and all draws with the second scheme respectively. When the sample
has been drawn according to either of these schemes, the $P(u_iu_j)$
required in (10) to estimate the variance of $\hat{T}$ are easily computed from
(20) or (24) respectively.
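The computations behind columns (4) to (8) of Table 2 can be sketched as below, following formula (21) for Scheme 1 and the quadratic system (22) with formula (23) for Scheme 2. Two assumptions are made loudly here: the eye-estimate for block 17 is not legible in the table and is taken as 21 because that value brings the column to its printed total of 394, and the adjustment for Scheme 1 is read from the table footnote as raising the two smallest eye-estimates (blocks 2 and 18) to 11.

```python
from math import sqrt

# Eye-estimated households Y_i for blocks 1 to 20; block 17 (index 16) is assumed.
Y = [18, 9, 14, 12, 24, 25, 23, 24, 17, 14,
     18, 40, 12, 30, 27, 26, 21, 9, 19, 12]
N, n = len(Y), 2

# Column (4): the desired inclusion probabilities 2 * Y_i / sum(Y).
desired = [n * y / sum(Y) for y in Y]

# Scheme 1: raise the two smallest estimates to 11 so that (21) stays positive.
Y_adj = [max(y, 11) for y in Y]
P1 = [n * y / sum(Y_adj) for y in Y_adj]                      # column (6)
p1 = [(N - 1) / (N - n) * P - (n - 1) / (N - n) for P in P1]  # column (5), from (21)

# Scheme 2: smaller root of p^2 - p + P/2 = 0 for each block, then rescale to sum 1.
roots = [(1.0 - sqrt(1.0 - 2.0 * P)) / 2.0 for P in desired]
p2 = [r / sum(roots) for r in roots]                          # column (7)
S = sum(p / (1.0 - p) for p in p2)
P2 = [p * (1.0 + S - p / (1.0 - p)) for p in p2]              # column (8), from (23)
```

Under these assumptions the values for block 1 round to .040, .090, .045 and .091, matching the legible entries of the table; blocks with equal eye-estimates necessarily share the same entries in columns (4) to (8).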


TABLE 2

                                                  Selection           Selection
                                                  Scheme 1            Scheme 2
Block    Number of    Eye-estimated   Desired
         households   number of       P(u_i)
         on ith       households on
         block        ith block
(1)      (2)          (3)             (4)         (5)      (6)        (7)      (8)
u_i      X_i          Y_i                         p_i*     P(u_i)     p_i      P(u_i)

 1       19           18              .091        .040     .090       .045     .091
 2        9            9              .046        .003     .055       .022     .045
 3       17           14              .071        .019     .070       .035     .070
 4       14           12              .061        .008     .060       .029     .060
 5       21           24              .122        .072     .121       .061     .122
 6       22           25              .127        .077     …          .064     .127
 7       27           23              …           .066     .116       .058     …
 8       35           24              .122        .072     .121       .061     .122
 9       20           17              .086        .035     .085       .042     .086
10       15           14              .071        .019     .070       .035     .070
11       18           18              .091        .040     .090       .045     .091
12       37           40              …           .157     .201       .108     .209
13       12           12              .061        .008     .060       .029     .060
14       47           30              .152        .104     .151       .078     …
15       27           27              .137        .088     .136       .069     .138
16       25           26              .132        .082     .131       .067     .133
17       …            …               …           …        …          …        …
18       13            9              .046        .003     .055       .022     .045
19       19           19              .096        .045     …          …        …
20       12           12              .061        .008     .060       .029     .060

Totals   434          394             2.0         1.0      2.0        1.0      2.0

* The two blocks (2 and 18) with the smallest eye-estimated size were arbitrarily assigned an
eye-estimated value of 11 households to satisfy the condition $2Y_i / \sum_{i=1}^{N} Y_i > 1/(N-1)$ in
obtaining these results.


COMPARISONS OF EFFICIENCY (IGNORING COST)

A primary purpose of this paper has been to extend the theory of
finite sampling with unequal probabilities to permit unbiased
estimation of the sampling error without resorting to additional assumptions.
As mentioned previously, however, it is of interest to compare the
relative effectiveness of using the supplementary quantitative variable Y
in alternative ways, e.g. stratification, other estimators, etc. No simple
expressions for the relative efficiency of sampling systems of the type
herein developed to alternative systems have as yet been obtained.
As indicative of the type of results obtainable when the relationship
between Y and X is one of approximate proportionality, the empirical
comparisons with a number of alternatives included in Table 3 are of
interest.


TABLE 3

Sampling                                         Method of                    Variance     Relative
system    Method of selection                    estimation                   of the       efficiency
                                                                              estimator    (%)

1.        Unrestricted random.                   $N\bar{x}$                   16,219       100
2.        Unrestricted random.                   $(\bar{x}/\bar{y}) \sum Y_i$ …§           497
3.        Stratified random; one element from    $N\bar{x}$                   7,873        206
          each of 2 strata with equal
          probability.
4.        Stratified; one element with           $\sum x_i/P_i$*              3,934        412
          probability proportionate to measure
          of size from each of 2 strata.
5.        Systematic sample; every kth from      $N\bar{x}$                   10,224       …
          random start.
6.        Midzuno; pair of elements with         $(\bar{x}/\bar{y}) \sum Y_i$ 3,579        …
          probability proportionate to the sum
          of the measures for the pair.
7.        Scheme 1                               $\sum x_i/P_i$†              3,095        …
8.        Scheme 2                               $\sum x_i/P_i$‡              3,075        …

* $P_i$ proportional to $Y_i$.

† $P_i$ as given in column 6, Table 2.

‡ $P_i$ as given in column 8, Table 2.

§ The bias for this estimator equals 1.17 which has been neglected here.


The quantity under estimate is the total number of households on the
20 blocks in Table 2. A sample of size 2 is considered. For this small
universe and sample size it was feasible to compute the exact variance
of the estimator employed in each sampling system directly from the
definition, so that, for example, the mean square error of the so-called
“ratio estimate” (line 2, Table 3) is not the usual approximation.

For the sampling systems 3 and 4 the blocks are ranked according to
the measure of size Y, and the ten largest blocks were taken as stratum
1 with the remaining 10 in stratum 2. The systematic sample is also to
be considered as drawn from the blocks after ranking from large to
small. It is of interest to note that the “ratio estimate” used in sampling
system 2 is identical with the estimator in Midzuno’s system (sampling
system 6), and that it is an unbiased estimator for his method of
selection.

The data presented in Table 3 are, of course, far from conclusive, 
but they do indicate that substantial reductions in variance can be ob- 
tained through the use of unequal probabilities without forfeiting an 
unbiased estimate of the sampling variance. 
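The relative efficiencies in Table 3 appear to be the variance under unrestricted random sampling (system 1) divided by the variance of the system in question, expressed in per cent; this reading is an assumption, but it reproduces the legible entries for systems 3 and 4. A minimal sketch, with the numbering of systems 5 to 8 likewise assumed from the order of the table:

```python
# Variances of the estimators as printed in Table 3 (system 2 is not legible).
variances = {
    1: 16219,   # unrestricted random
    3: 7873,    # stratified random, equal probability
    4: 3934,    # stratified, probability proportionate to size
    5: 10224,   # systematic sample
    6: 3579,    # Midzuno
    7: 3095,    # Scheme 1
    8: 3075,    # Scheme 2
}

def relative_efficiency(variance, base_variance):
    """Per cent efficiency relative to the base system (assumed definition)."""
    return 100.0 * base_variance / variance

efficiency = {system: relative_efficiency(v, variances[1])
              for system, v in variances.items()}
```

On this definition Schemes 1 and 2 are each more than five times as efficient as unrestricted random sampling with the simple expansion estimator, which is the comparison behind the conclusion drawn in the text.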

It is the opinion of the authors that the techniques suggested by this 
paper may be of greatest utility in specialized enquiries where the char- 
acteristics under measurement are few and related, or where selection 
with unequal probability arises naturally. The estimator (6) from a 
computational point of view is at a serious disadvantage when com- 
pared with self-weighting estimators. The estimated variance (9) has 
similar disadvantages when compared with designs that permit estima- 
tion of error by the use of an analysis of variance or other simple tech- 
nique. When an unbiased estimator of high precision and an unbiased 
sample estimate of its variance are required, however, the sampling 
system employing unequal probabilities, with the selection of two or 
more units at each stage of sampling, may be particularly appropriate. 
This is particularly true when the universe (at any stage) is small and 
the alternative use of the information in Y is a ratio-estimator (based 
on Y with equal probability selection) with its possible bias and un- 
known error. 

A modified formulation of the theory in connection with the tech- 
nique suggested by Hartley and Politz-Simmons [7] for the problem 
of the “not-at-homes” in an interview survey is possible along the lines 
suggested by this paper. It also appears that the technique of control
beyond stratification suggested by Goodman and Kish [1] is closely
related to the problem of the optimal assignment of the $P(u_iu_j)$.

Finally, the possibility of employing the sampling systems considered
here in connection with “point sampling” is of considerable interest. By
“point sampling” we have reference to the selection of farms in an
agricultural survey by locating points at random on a map of the area
to be surveyed, and including as sample elements the farms within
whose boundaries the points happen to fall. (See, for example, F. Yates
[8].) It is clear that the size of the farms will be related to their
probability of inclusion in the sample, and that unbiased estimates are
possible from samples drawn in this manner. The details for this case
and similar cases in other types of investigations remain to be worked out.

The authors are grateful to Dr. R. J. Jessen for kindling our interest
in this problem and to Professor O. Kempthorne and Dr. P. C. Tang
for their helpful criticisms and advice in the preparation of this paper.
ERRATUM: THE EFFECTIVENESS OF QUALITY
CONTROL CHARTS


Leo A. Aroian and Howard Levene

The following corrections should be made in the article published
under the above title in this journal (Vol. 45, 1950, pp. 520-529).
1) Page 521: 6th line from top, insert “it” between “that” and “is”.


| 


N N-1 
2) Page 524: equation (9), replace “||” by “[]” and add “for 
t=1 i=1 


t=1, f(N) =n”. 
3) Page 525: 6th line from top, replace “m,” by “my”. 
BOOK REVIEWS


Mathematics of Statistics, Part Two. Second Edition. J. F. Kenney and E. S.
Keeping. New York: D. Van Nostrand Company, Inc., 1951. Pp. xiii, 429. $5.50.


Douglas G. Chapman, University of Washington


THIS new edition of Mathematics of Statistics, Part Two, is more than twice
as long as the original. It can, therefore, be regarded as a new book. It is
designed as a text for mathematics students with preparation through
advanced calculus, but none in matrix theory. This represents a reversal of the
usual prerequisites for courses in mathematical statistics.

The topics covered include probability, distribution theory, the laws of
large numbers, large and small sample theory, regression (linear and
curvilinear), correlation (simple and multiple), analysis of variance and covariance,
and, finally, a chapter on statistical inference. A large number of sub-topics
are covered under each of these main headings so that in some respects the
book is encyclopedic. As the authors note in the preface, the choice of topics
is to some extent a matter of personal choice; however, the almost complete
omission of the mathematics of non-parametric and distribution-free
inference may be noted.

The mathematics in this book is carefully done. Where proofs are omitted, 
this is stated, and no attempt at pseudo-rigor is presented. The importance 
of probability is stressed, though some may feel (as the reviewer does), that 
the authors hedge a little in presenting three different “definitions” of prob- 
ability. Their suggestion, that the axiomatic approach to probability theory 
is too difficult for undergraduate students, has been refuted by Feller’s ex- 
cellent textbook, Introduction to Probability Theory and Its Applications. 
However, the multiple approach does give the instructor a choice of using the 
definition he prefers. 

Much of the new material in the book and many of the new problems 
(there are a large number of problems at the end of each chapter) are statis- 
tical, rather than mathematical, in content. It may be noted that the title of 
this book emphasizes that it is a mathematics text. However, it will be advo- 
cated for and used by those interested in teaching or learning mathematical 
statistics. For such people, this book is not, in the opinion of the reviewer, a 
suitable text. The important conceptual bases of mathematical statistics
have been relegated to the final chapter, where limitations of space alone
make an adequate treatment impossible. The concepts of Type 1 and Type 2
error and the Neyman-Pearson theory of testing hypotheses are covered in
a few brief pages of this final chapter (although Type 1 error is mentioned
briefly and vaguely on two earlier occasions). A chapter is devoted to least
squares, but the Markov theorem, which forms the foundation stone of the
technique, is dismissed in part of a sentence and not even mentioned by
name.




In the chapter on the analysis of variance, the basic assumptions are
carefully stated and the underlying models outlined. However, the basic theorem
on testing a linear hypothesis is omitted. Such a theorem would not only
justify the whole theory, but also prepare a student with advanced calculus to
derive for himself all of the complex analysis of variance tests with multiple
interactions. Moreover, the same care is not always carried into the examples;
for example, on page 249, the interaction sum of squares is lumped with the
residual sum of squares, without a word of warning concerning this
procedure. In example 8, pages 267 ff., the hypotheses being tested are not
specified. In fact, this example reads much like one in the standard statistical
“cookbooks.”

A common error appears on page 118 where error of the first kind is first
mentioned. In dealing with tests it is stated, “It occasionally happens that
the value of $\chi^2$ from the sample is unexpectedly small, corresponding to a
probability of nearly one. In such a case, the fit is too good. . . . ” To reject the
hypothesis or to question the assumptions on this basis is clearly to modify
the problem or the significance level. If the hypothesis is rejected, we have a
combination of a most powerful and a least powerful test.

The lack of a conceptual basis to the book to a large extent makes it a 
collection of results rather than an integrated theory. The organization at 
times is rather puzzling. For example, in ten pages of Chapter IV the follow- 
ing topics are covered: cumulants, Sheppard’s corrections, orthogonal linear 
transformations, and the weak and strong laws of large numbers. From the 
point of view of a textbook, it is cluttered with far too many minutiae running 
the gamut from Poisson and Lexis schemes and the log normal distribution 
to two approaches to discriminant functions. To cover the text in a three- 
hour one-year course, as proposed, an instructor would indeed have to be 
discriminating. 

While in the reviewer’s opinion the text fails from a statistical point of 
view, it is probably the best available at its level for those whose interest is 
primarily in mathematical techniques. Moreover, because of its inclusiveness 
with respect to many of the topics covered, it should prove a useful reference 
text in any mathematical statistics course. 


Probit Analysis: A Statistical Treatment of the Sigmoid Response Curve. 
D. J. Finney (Lecturer in the Design and Analysis of Scientific Experiment, Uni- 
versity of Oxford). Foreword by F. Tattersfield (Head of the Department of 
Insecticides and Fungicides, Rothamsted Experimental Station). Second Edition. 
London and New York: Cambridge University Press, 1952. Pp. xiv, 318. $7.00. 


K. A. Brownlee, University of Chicago

THE first edition of this book was extensively reviewed by Margaret Merrell
(2 pages) and Joseph Berkson (5 pages) in the 1948 volume of this
journal, so a re-review of the original matter is not called for here. It is
probably sufficient to remark that the first edition was everywhere
acclaimed as a model of good writing. The same judgment applies to this
edition. It is an admirably constructed book, put together with great care
and accuracy and respect for the reader. As an authoritative survey of this
important field of applied statistics it stands unchallenged.

On purely technical grounds, the layman to the publishing business might 
find cause for complaint. Although the Preface to the second edition is 
dated December 1949 (and the latest reference included is to Cornfield and 
Mantel’s paper [1] in the June 1950 issue of this journal), the publication 
date is May 1952. Two years seems to be an excessively long time to require 
to issue an edition virtually unchanged except for the addition of 62 pages. 
Further, this edition is offset-litho which seems to give weak and wishy- 
washy printing. Finally, this edition costs 187% of the 1947 edition. This 
increase is more than would be accounted for by five years’ inflation. The 
price must be considered high, but since the book occupies a monopoly 
position we are obliged to tolerate this. 

This second edition has been expanded by 62 pages. Up to page 195 the 
pagination is virtually unchanged, though there are some minor modifica- 
tions in four sections. The major part of the increase, 36 pages, lies in a 
new chapter, “Recent Developments.” Table II, the Weighting Coefficient,
is extended from 40% to 90% natural mortality by 15 more pages. Other 
additions include two new tables, two pages of new references, and the index, 
already good, has been made even better. Appendix II, the Mathematical 
Basis of the Probit Method, has been extended slightly and made somewhat 
easier by additional explanations and the insertion of more readily available 
secondary references (e.g. to Kendall’s and Cramér’s texts) in addition to 
the primary (e.g. Fisher’s 1922 paper in the Philosophical Transactions). 
The reviewer would have liked to have seen included a proof of the complete 
formula for fiducial limits and illustrations of their behavior for various 
values of the g factor, as in Irwin’s 1943 Journal of Hygiene paper. 

The new chapter, “Recent Developments,” discusses alternatives to 
maximum likelihood estimation; a problem first raised by Wadley involving 
the Poisson distribution with the equivalent of adjustments for natural 
mortality; a description of the analysis of a 2* factorial experiment in which 
parallel probit lines could be fitted; a rather brief account of the possibilities 
of incomplete replication in complex experiments; an account of Plackett 
and Hewlett’s work on the theory of independent action of mixtures of 
poisons; some considerations in the estimation of a percentage point; and 
Dixon and Mood’s up-and-down method. 

The section on the estimation of a percentage point is particularly useful 
as it takes up the obscure question which has been so often evaded, namely, 
given a reasonably reliable preliminary estimate of the LD50 and σ and a
certain number of subjects, how best to divide them into groups and at
what points on the dosage scale. Table 37.6 gives values of Ng and Nb²V(m)
for various experimental arrangements but omits the function in which we
are finally interested, namely b²V(m)/(1 − g) for particular values of N.
Two extra columns for this function for N = 20 and 100 (the values which
Finney discusses in the text) would save the reader having to make these
calculations mentally.
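
The arithmetic in question is short. Since g = (Ng)/N and b²V(m) = (Nb²V(m))/N, the function of interest reduces to Nb²V(m)/(N − Ng). The following Python lines are a minimal sketch of that calculation. The function name and the input values are hypothetical illustrations, not entries from Table 37.6, and the sketch assumes that the tabulated Ng and Nb²V(m) do not themselves change with N.

```python
def variance_function(n_subjects, ng, nb2v):
    """Return b^2 V(m) / (1 - g) from tabulated values of Ng and N b^2 V(m)."""
    g = ng / n_subjects
    if g >= 1.0:
        raise ValueError("g must be below 1")
    b2v = nb2v / n_subjects
    return b2v / (1.0 - g)


# Hypothetical tabulated values (NOT taken from Table 37.6), for illustration.
for n in (20, 100):
    print(n, variance_function(n, ng=4.0, nb2v=8.0))
```

The same result is obtained from the single expression nb2v / (n_subjects - ng), which shows directly how the penalty from g shrinks as N grows.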

The section on the up-and-down method fails to convey to the reader the 
fact that its efficiency is in all practical usages considerably greater than the 
standard probit method.

In the section on alternatives to maximum likelihood estimation we enter 
the battle, grossly unequal in numbers, between the powerful hordes of 
probiticians and the tiny, possibly single-handed, band of logisticians. This 
battle has not abated but rather intensified in the five years that have 
elapsed since the first edition of this book. Incidentally, if we are going to 
have names for these things, the reviewer would suggest “blits” and 
“berkits,” while the angular transformation could be the “knit.” 

The logisticians seem to have won the first campaign, namely to get the 
logistic curve accepted as a legitimate alternative to the integrated normal. 
It has gradually been accepted that all possibilities are approximations to 
reality, so it is permissible to use whatever approximation one chooses 
providing it is close enough. The clinching point in this argument seems to 
be some conclusions by Tukey [2] that about 1600 subjects would be needed
to distinguish between the two curves at the 0.5% point, and 7000 at the 
15% point. This puts the possibility of discriminating experimentally between 
the two curves quite beyond our reach for almost all applications, and we 
are left free to pursue personal choice. 
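
How close the two curves lie can be checked numerically. The sketch below compares the integrated normal with a logistic curve whose argument is rescaled by 1.702; that constant is a commonly quoted matching factor and is an assumption of this illustration, not a figure from the book or from Tukey. The function names are likewise hypothetical.

```python
import math


def normal_cdf(x):
    """Integrated normal curve (standard normal distribution function)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def logistic_cdf(x, scale=1.702):
    """Logistic curve, rescaled (assumed factor) to track the normal."""
    return 1.0 / (1.0 + math.exp(-scale * x))


def max_gap(scale=1.702, lo=-6.0, hi=6.0, steps=2401):
    """Largest absolute difference between the two curves on a grid."""
    width = (hi - lo) / (steps - 1)
    return max(
        abs(normal_cdf(lo + i * width) - logistic_cdf(lo + i * width, scale))
        for i in range(steps)
    )


print(max_gap())
```

With this scaling the two response curves differ by less than one percentage point everywhere, which gives some feeling for why very large experiments would be needed to tell them apart.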

The emphasis in the battle has now shifted to the question of estimation, 
the probiticians advocating maximum likelihood and the logisticians mini-
mum χ². For a long time the maximum likelihood method basked in the
sun of its optimal asymptotic properties, with “asymptotic” in very fine
print (of course, minimum χ² has the same asymptotic properties). Gradually
the idea has sunk into the public consciousness that maybe this does not
carry over to finite, and in particular small, samples. Thus this second
edition contains towards the end of Appendix II, on page 250, the proviso 
“These results are, strictly, applicable only for large samples, and research 
into the appropriate formulae for small samples is needed.” This remark 
was lacking in the first edition. Currently, apparently, theory gives no help. 
Berkson’s preliminary reports [3, 4] of his sampling investigations suggest 
that minimum χ² might be better, in the sense of having a smaller error
mean square, than maximum likelihood. Clearly, there is here the possibility 
that the last citadel of the probiticians may be stormed. 

It seems to the reviewer that Finney is hardly fair to the short-cut meth- 
ods. Eisenberg’s [5] comparison with the Litchfield-Wilcoxon nomographic 
approximation showed that for 52 assays the differences in the ED50 were 
very small compared with the standard error. Similarly, the differences in 
the 95% fiducial limits were small compared with the error in their estima- 
tion. 

Finney also seems to the reviewer to give inadequate credit to Kärber’s
method. The second edition adds hardly anything to the rather deprecatory
discussion in the first edition, though in the Preface Finney does say he 
would have liked to discuss it further. From the theoretical point of view 
Cornfield and Mantel [1] have shown that it is much more respectable than 
might have been thought. In fact, for certain reasonable conditions it pro- 
vides a maximum likelihood estimate. From the practical point of view, 
Berkson [6] and Armitage and Allen [7] have shown the divergences to be
negligible for all practical purposes. 
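
For readers who have not met it, the estimate usually described under Kärber's name (often called Spearman-Kärber) can be sketched in a few lines of Python. The form used here is the commonly stated one, a sum of interval midpoints weighted by the rise in observed response, and is not quoted from Finney's text. It assumes the observed proportions run from 0 at the lowest dose to 1 at the highest; the function name and the data are hypothetical.

```python
def spearman_karber(log_doses, proportions):
    """Estimate the median effective (log) dose from quantal responses.

    Each interval midpoint is weighted by the increase in the observed
    proportion responding across that interval.
    """
    if len(log_doses) != len(proportions) or len(log_doses) < 2:
        raise ValueError("need matching dose and proportion lists")
    if proportions[0] != 0.0 or proportions[-1] != 1.0:
        raise ValueError("this sketch assumes responses run from 0 to 1")
    estimate = 0.0
    for i in range(len(log_doses) - 1):
        rise = proportions[i + 1] - proportions[i]
        midpoint = (log_doses[i] + log_doses[i + 1]) / 2.0
        estimate += rise * midpoint
    return estimate


# Hypothetical symmetric assay: the estimate falls at the central dose.
print(spearman_karber([0, 1, 2, 3, 4], [0.0, 0.25, 0.5, 0.75, 1.0]))
```

No iteration and no table of working probits is required, which is the practical attraction of the method.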

Similarly, Armitage and Allen [7] found that for 12 typical assays in 
the literature Berkson’s approximation to minimum χ² for the logistic
(which does not require iteration) gave as good a fit, as measured by
χ² = Σn(p − P)²/PQ, as the maximum likelihood solutions for the integrated
normal. The sums of the χ², with 62 degrees of freedom, were 47.8 and 51.1
respectively. Clearly the Berkson approximation need not hang its head.
The exact minimum χ² was even better, of course, giving a χ² of 43.0.
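
A non-iterative fit of this kind, and the measure of fit quoted above, can be sketched as follows. The sketch takes the approximation to be a weighted least-squares line through the observed logits ln(p/q) with weights npq, which is the form in which Berkson's method is usually described; that reading, the function names, and the data are assumptions of this illustration rather than details given in the review.

```python
import math


def min_logit_chi2_fit(doses, n, r):
    """Fit logit(P) = a + b*x by one weighted least-squares pass."""
    xs, logits, weights = [], [], []
    for x, ni, ri in zip(doses, n, r):
        p = ri / ni
        if p <= 0.0 or p >= 1.0:
            continue  # observed logit undefined; dropped in this sketch
        xs.append(x)
        logits.append(math.log(p / (1.0 - p)))
        weights.append(ni * p * (1.0 - p))
    sw = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / sw
    l_bar = sum(w * l for w, l in zip(weights, logits)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    sxl = sum(w * (x - x_bar) * (l - l_bar)
              for w, x, l in zip(weights, xs, logits))
    b = sxl / sxx
    return l_bar - b * x_bar, b


def fit_chi2(doses, n, r, a, b):
    """Sum of n (p - P)^2 / (P Q) over dose groups."""
    total = 0.0
    for x, ni, ri in zip(doses, n, r):
        big_p = 1.0 / (1.0 + math.exp(-(a + b * x)))
        total += ni * (ri / ni - big_p) ** 2 / (big_p * (1.0 - big_p))
    return total


# Hypothetical assay: 40 subjects at each of four log doses.
doses, n, r = [0.0, 1.0, 2.0, 3.0], [40, 40, 40, 40], [6, 15, 27, 35]
a, b = min_logit_chi2_fit(doses, n, r)
print(a, b, fit_chi2(doses, n, r, a, b))
```

The whole fit is a single pass of sums over the dose groups, which is consistent with the remark that no iteration is needed.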

Timewise, the probit method is expensive in the extreme. Berkson [8] 
reports that 30 approximate minimum χ² analyses can be made in the
time required for one probit analysis. In view of this, one wonders why
people persist in probitistics when they achieve no advantage practically
and probably none theoretically.

The reader will gather from the above discussion that the reviewer feels 
that Finney takes a somewhat narrow view of certain aspects of this subject. 
The reviewer feels that this is unfortunate as it will enable probitistics to 
continue to occupy a position of absolute authority to which it is not en-
titled. The reviewer suspects that the third edition of this excellent book,
when it comes to be written, will contain significant retreats, perhaps even 
to the point of the disappearance of the word probit from the title. Mean- 
time, however, he can recommend the second edition to all those who need 
to handle quantal data in their work, to all those who are interested in applied 
statistics in general, and to all those who delight in fine writing. 


Quality Control and Industrial Statistics. Acheson J. Duncan (Associate Profes-
sor of Statistics, Johns Hopkins University). Chicago: Richard D. Irwin, Inc.,
1952. Pp. xxvii, 400. $8.00.


BERNARD P. DUDDING, The General Electric Company, Ltd. (England)


THIS is an ambitious book, the title of which is not suited to its contents.
As stated in the Preface it is designed as a text-book for engineering and
business students and although the mathematical proofs of some statistical 
relationships are relegated to Appendix I, the book covers a wide field of 
statistical techniques. The subject of Control Charts dealt with in Part IV 
occupies rather less than one hundred pages whilst over two hundred 
pages in Part V are devoted to more advanced statistical methods. The 
serious student of statistics will find the selected mathematical proofs useful 
and all readers will be grateful for the summaries of selected references 
given at the end of each sub-section, which add greatly to its value as a book 
of reference. The absence of references to many valuable British publications 
will limit its usefulness to British readers. 

The curricula of studies for engineering and business students are already 
so large that few will find time to master the many statistical techniques 
described. The engineer and busy executive are often critical of the algebraic 
formulation of statistical procedures and their attitude will be fortified by 
reference to Appendix III. Here, five pages are devoted to tabulation of 
the symbols used in the book and it will be noted that the same symbol is 
sometimes used for different purposes. If the reader is familiar with other 
American text-books and the A.S.T.M. Manual on this subject, he will also 
find that the various authors use different symbols to denote the same fea-
ture of statistical nomenclature. The Greek and Latin alphabets, using small
and capital letters with subscripts, appear to be quite inadequate to the
task of expounding statistical principles. The need for international agree-
ment in this respect is great but will continue to be a remote ideal until
authors in the same country establish standardisation of symbols.

From the point of view of the general reader interested in the use of 
statistics as an industrial tool the author has established his claim to having
written a readable book, and apart from the reference to the American game 
of Craps, British readers will also agree with this comment. The book 
contains references to special procedures including a reference to modified 
control limits which the writer of this review believes he first introduced, 
coupled with a simple procedure to determine when such limits are appli-
cable.* There are occasional examples of small sections in the book which
suggest lack of contact with practical conditions, for example, the suggestion
to construct control charts for standard deviation for samples containing
more than twelve specimens when the range becomes unreliable. It is ex-
tremely doubtful whether the standard deviation would be calculated for
the purposes of a control chart. The general reader would probably like more
definite guidance concerning which sections of each Part should first be read,
particularly if he were unable to attend a regular course but wished to lay
the foundations of an understanding of statistical principles and to initiate
routine application of them in the normal course of his industrial work.

Used as a text-book at a college or university the book will find favour 
because any necessary selection of subject matter to suit different types of 
students could no doubt be made by the lecturer or demonstrator, provided 
the latter had had industrial experience to assist him in making the choice. 
Whether or not students who have used this book during a course are 
successful later in life in applying this knowledge will depend largely on 
whether they have had opportunity to do statistical work under industrial 
conditions during vacations. 

There is little doubt that the principal problems which confront the user 
of statistical techniques in industry frequently arise from the fact that the 
available data are not always in a form which makes it readily possible to 
follow established procedures. Hence, the duty of persons practising under 
industrial conditions is to adjust the procedures to meet every-day require- 
ments, such adjustment demanding a good appreciation of the statistical, 
technical, and economic problems involved. Success cannot be assured by 
following any prescribed course or reading particular text-books, but can 
be achieved by one whose training in the three necessary fields is sound and 
broadly based. This book will serve as an aid to this end in the field of 
statistics. 


ASTM Manual on Quality Control of Materials. Special Technical Publication 15-C. 
Philadelphia: American Society for Testing Materials, 1951. Pp. xiv, 127. $1.75. 
Paper. 

HARRY M. HUGHES, University of California (Berkeley)


HERE is a concise manual on the presentation of data, the first major re-
vision of the ASTM Manual on Presentation of Data, whose main section
and two supplements have been revised to form respectively Parts 1, 2, and 3 
of the present work. It was prepared by Committee E-11 (Harold F. Dodge, 
Chairman). The subject matter is limited to a consideration of single-sam-
pling, single-variable problems, thereby avoiding double- or multiple-sam-
pling and regression or correlation studies. The statements throughout are
simple and concise, with an illustrative example for each point. Duplication
in the various parts is eliminated by an excellent system of cross-referencing
whenever a subject arises that is treated elsewhere in the manual. A Glossary
appears at the end of each part, including a glossary of symbols for control
charts in Part 3.

* Quality Control Chart Techniques When Manufacturing to a Specification, by B. P. Dudding,
M.B.E., Ph.D., and W. J. Jennett, B.Sc., (Eng.). Research Laboratories, The General Electric Co.,
Ltd., England, August 1944.

In Part 1 are outlined several presentations of frequency distributions, 
measures of central tendency, measures of dispersion, and the skewness 
coefficient, with detailed suggestions for easy computing. It is then made 
clear how informative each of the statistics may be, alone or in combina- 
tions. The additional information to be gained from evidence or knowledge 
that data were obtained under “controlled conditions” is indicated also. For 
the exact meaning of “controlled conditions,” however, the reader is referred 
to a discussion of Shewhart’s, thus leaving it unclear what information should 
be presented to support the claim that data were obtained under controlled 
conditions. In summary, a recommendation is made for the presentation of 
data both at the beginning and end of Part 1. 

In Part 2 is discussed the problem of an interval estimate of the popula- 
tion mean based on the sample mean, including method of computation, 
presentation and interpretation of such confidence intervals. Constants for 
95 per cent confidence intervals have been added to the previously available 
90 and 99 per cent confidence tables. 

In Part 3, which treats the control chart method of analysis and presenta- 
tion of data, this manual presents in addition to the previous Supplement B 
material a discussion of control charts for individuals and moving ranges of
two, as well as a separate treatment of “number of defectives,” “number of 
defects,” and “number of defects per unit” charts. All are considered both 
with and without a given standard and a complete numerical example is pre- 
sented for each of the twenty-five sets of conditions. The glossary of sym- 
bols might well have indicated that the letter “c” is used as an acceptance 
number in acceptance sampling, since that field is likely to be encountered by 
a person using control charts. Mathematical formulas pertinent to the fac- 
tors used for control limits are presented and discussed accurately in a sup- 
plement to Part 3. (Note that the radical in formula (B1), page 110, should 
not extend over either of the factorials; also formula (B4) should have cg on 
the left of the equality sign.) There follows also a complete table of control 
chart factors on page 115, and a table of squares and square roots of numbers 
to 2000. Squares are given exactly and roots are given to four decimal places. 
The table of all control chart factors on page 115 is very convenient; it would 
be even more useful if it were to indicate which columns apply to “standard 
given” and which to “standard not given.” The factors B₁, B₂, B₃ and B₄,
which depend on the sampling variance of σ, appear for the first time in ex-
act form, giving more accurate values for small sample sizes and avoiding
some previous contradictions. 
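
As an illustration of the charts for individuals and moving ranges of two mentioned above, the “standard not given” limits can be sketched in Python. The constants used are the standard control chart factors for subgroups of two (d₂ = 1.128, so that 3/d₂ is about 2.66, and D₄ = 3.267); they are supplied here from general knowledge and are not copied from the manual's table on page 115. The function name and the data are hypothetical.

```python
def individuals_chart_limits(values):
    """Control limits for individuals and for moving ranges of two."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    e2 = 3.0 / 1.128   # three-sigma multiplier of the average moving range
    d4 = 3.267         # upper-limit factor for ranges of two
    return {
        "center": x_bar,
        "lcl": x_bar - e2 * mr_bar,
        "ucl": x_bar + e2 * mr_bar,
        "mr_center": mr_bar,
        "mr_ucl": d4 * mr_bar,
    }


# Hypothetical measurements, for illustration only.
print(individuals_chart_limits([10.0, 12.0, 11.0, 13.0, 12.0]))
```

The lower limit for the moving range is zero for ranges of two and is therefore omitted from the sketch.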

Three minor changes might add to the already excellent clarity in future 
editions. On page 14 formula (5) there is a misprint of X? instead of X* under 
the first radical. The two paragraphs on total information at the start of
page 21 could use a clarifying statement to the effect that order is to be con- 
sidered in Part 3, the Shewhart total information concept being used in Part 
1. And finally, on page 115, Note 3 might well contain a reference to the ex- 
ception in Note 7 lest the reader stop at Note 3 with an incorrect impression 
of the accuracy of parts of table B2. 

The minor character of the suggested changes reflects the general excel- 
lence of the manual for the purposes intended. 


Conference on Business Cycles, National Bureau of Economic Research, Inc. 
New York: National Bureau of Economic Research, Inc., 1951. Pp. xii, 433, 
$6.00. 


GEORGE GARVY, Federal Reserve Bank of New York


UNTIL the publication of the present volume, the surprisingly small num-
ber of textbooks on business cycles has been the most striking manifes-
tation of the lack of a common approach and of an undisputed body of
basic knowledge and conclusions. Now we have, between two covers, a 
sample of the basic cleavage among contemporary economists in the ap- 
proach to business cycle research. Fortunately, the sample is fairly represent- 
ative, the respective positions are clearly discernible, and the empirical stud- 
ies reported are important in scope and suggestive in their methods and 
findings. 

The present volume is a collection of papers presented at a Conference on 
Business Cycles held in November 1949. It is neither a systematic review 
of the progress made in business cycle research since, say, the outbreak of the 
war, nor an attempt to stake out new fields for research (such as the pre- 
ceding volume in the “Special Conference Series” which was devoted to 
Problems in the Study of Economic Growth). Each of the papers is followed by 
comments. Many distinguished discussants raise important points which 
transcend the topical limitations of the papers to which their comments are 
addressed (such as Angell’s and Leontief’s comments on Klein’s paper). 

While the papers are grouped under three headings—(1) General Papers, 
(2) Profits, Investment, and Business Cycles, and (3) Business Cycle Re- 
search and Policy—their only common denominator is the fact that they 
were given at the same conference. It is, indeed, interesting to speculate 
whether the time at which the conference was planned accounted for its 
lack of focus. For many years, the problems of a war economy and of the 
subsequent period of reconversion had deflected the attention of many 
research workers from cyclical fluctuations. Worst of all, the sacrilegious 
thought began to creep into the thinking of some of us that, perhaps, 
business cycles belong to history. The events of the three years since the 
conference will undoubtedly give new stimulus to business cycle research 
and orient it increasingly towards problems discussed in Part Three of the 
volume, which includes three important exploratory essays (by Ashley
Wright, Haberler, and Smithies) on business cycles and policy. 

Although the conference was sponsored by the National Bureau of Eco- 
nomic Research, only three papers are contributed by the regular staff of 
the sponsoring institution: Arthur F. Burns’ brief review of Mitchell’s
What Happens during Business Cycles: A Progress Report (published in the
meantime as introduction to that volume), Moses Abramovitz’s summary
of Part III of his Inventories and Business Cycles, and Thor Hultgren’s
three-page summary of his Cyclical Diversities in the Fortunes of Industrial
Corporations; the last two studies have also been published since the con- 
ference. 

The short papers by Burns, Tinbergen, and Schumpeter in Part One 
exemplify three important approaches to business cycle research. They 
neither contain much that has not already been said by their authors, nor 
do they attempt to contrast systematically alternative types of business 
cycle research. Tinbergen’s short “Reformulation of Current Business Cycle
Theories as Refutable Hypotheses” contains what is, perhaps, the most 
absolute claim for the superiority of the econometric approach. He states 
categorically: “Only the mathematical theories give a complete list of the 
phenomena included and of the causal connections accepted as existing 
among them.” And, unperturbed by the misfortunes of Klein’s model under 
Christ’s scalpel, Koopmans, in commenting on Tinbergen’s paper, finds 
consolation in the thought that the tower from which to look beyond an 
unscalable wall is just not high enough yet. 

In contrast to the econometricians’ quest for systematic and stable rela-
tionships, Schumpeter (“Historical Approach to the Analysis of Business
Cycles”) pleads for a study of forces which make for changes over time (in
form and position) of consumption and production functions. For him, the 
dynamic process is not one of dated quantities but of revolutionizing quali- 
ties. 

Carl Christ’s paper “A Test of an Econometric Model for the United
States, 1921-47” is, perhaps, the most interesting in Part One for the 
readers of this Journal. It is a careful examination of the forecasting value 
of Klein’s well known econometric explorations. Christ’s verdict is that 
predictions for 1948 based on Klein’s model (as modified by Andrew W. 
Marshall) are no better than predictions made by “naive” models which 
simply extrapolate either the value of each variable from the preceding year 
or the trend between the preceding years. 
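
The two “naive” benchmarks are simple enough to state in a few lines of Python. The sketch below follows the description just given; the function names and the series are hypothetical, and nothing here reproduces Christ's actual data or computations.

```python
def naive_same_level(series):
    """Forecast next year's value as equal to the preceding year's value."""
    return series[-1]


def naive_same_change(series):
    """Forecast by extending the change between the two preceding years."""
    return series[-1] + (series[-1] - series[-2])


# Hypothetical annual values of one variable, for illustration only.
history = [100.0, 104.0, 110.0]
print(naive_same_level(history), naive_same_change(history))
```

Any model with economic content is expected to do better than these rules, which is why a failure to do so is a severe verdict.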

This verdict, Klein is unwilling to accept. His own contribution—one of 
the two principal papers in the second part—suggests, however, that the 
econometricians’ promise of bigger and better models would sound more 
convincing if there were more evidence of their readiness to accept sympa- 
thetic criticism. Indeed, Klein’s “Studies in Investment Behavior,” which in 
the main presents a cross-sectional analysis of railroad investment, arrives 
at few significant and definitive conclusions. The first two, that “there can 
be no doubt about the influence of profit on investment” and that “lower
interest charges stimulate investment” are, while not new, hardly subject 
to controversy. The third conclusion, that the stock of capital appears to 
have a depressing effect on investment, is very doubtful even in the light of 
the author’s analysis. Klein’s claim that “the systematic role of other 
variables in railway investment decisions remains dubious” is most disturb- 
ing, however. As Angell points out in his comments, Klein has not attempted 
to test the exclusiveness of his hypothesis. While the explanation of one 
variable by another might be plausible on theoretical and statistical grounds, 
it still might be that some other equally plausible variable provides as 
good—or a better—fit. 

Among the new hypotheses of a more general importance advanced in 
the volume, this reviewer finds Gordon’s related concepts of incomplete 
and overlapping cycles most intriguing. Gordon’s working hypothesis for 
the study of the investment boom of the ’twenties involves situations where 
investment stimuli primarily responsible for an upswing may come to be 
superseded by a new and stronger set of stimuli (one cycle following upon 
an incomplete portion of another), or when accumulation of strength arising 
from new stimuli occurs during the contraction phase of the preceding 
cycle (overlapping cycles). Gordon’s carefully documented and cautiously 
worded analysis contrasts with the wide vistas and definitive statements 
of some of the other contributions. This reviewer has a hunch, however, that 
a student reaching for this volume in 1962 will be looking for the former 
rather than for the latter. 

This volume suggests that the conflict between the econometric approach 
to business cycle research and the older schools is not in the use of statistical 
information (or, if econometricians had heeded Marschak’s advice, of “all 
other available information,” in addition to time series), but in the place 
assigned to quantitative elements in the analysis. The premise of model 
builders is that all significant forces can be quantified and that all variables 
identified as relevant for the period of observation will continue to be signifi- 
cant in the future. There is no place in econometric models for noncon-
tinuous, unique forces which may be relevant, or even decisive, for shaping
one cyclical phase, but which dwindle or vanish subsequently. 

To this, an econometrician will reply that any variable may assume the 
value zero. It remains, nevertheless, true that the introduction of many 
additional variables which assume the value zero except for relatively short 
periods not only presents serious problems of estimation, but also raises the
question of quantification of such influences and the identification of the 
nature of their relationship to other variables. 

Gordon’s main conclusion from the study of the ’twenties that “the 
salient feature is the magnitude and extent of the structural changes imposed 
on the American economy in the short span of a decade” (p. 210), coupled 
with Kuznets’ warning against the foreshortening of the time perspective 
of analysis because of the deficiency of data for longer periods, are valuable
guideposts for a reorientation of business cycle research. It is to be hoped
that subsequent conferences on business cycle research (and it is certainly
desirable to give this important branch of economic research a stimulus
similar to that provided in a related field by the Conferences on Income and
Wealth) will focus, among other things, on the relevance of structural
changes for the variations in the intensity and pervasiveness of business
cycles and on the relation of cyclical fluctuations to the process of economic
growth.


Measures of Business Change. Arthur H. Cole, with the assistance of Virginia 
Jenness and Grace V. Lindfors. Chicago: Richard D. Irwin, Inc., 1952. Pp. xxi, 
444, $7.50. 


MORRIS B. ULLMAN, U.S. Bureau of the Census


INFORMATION on approximately 450 national and regional indicators of
business change has been assembled here in a single volume for the use
of an audience defined broadly as “analysts of business conditions, pri- 
marily associated with private business institutions, students of business 
cycles in colleges and universities of the country, and reference librarians 
connected with academic and public libraries.” The information presented 
includes the compiler, frequency of current publication, location of current 
and historical data, period covered by the indicators and a brief description, 
which sometimes gives the method of obtaining the index, the base for 
indexes, the components available and similar information. It should be 
noted that the descriptions are for the most part very brief and vary con- 
siderably as to the amount of information included. 

In the Preface, Dr. Cole indicates this volume was started as an attempt 
to bring up to date Davenport’s Index to Business Indexes, but the resurvey 
preliminary to the revision led into other directions and the idea that this 
volume be a revision of the earlier book was abandoned. 

In comparing the present volume with Dr. Davenport’s earlier book, Dr. 
Cole points out that not only were the materials brought up to date, but 
also that a “broader net” was used to obtain items for inclusion. A broader 
definition of the term “index” was also used. “I have chosen to define the 
term ‘index,’ ” Dr. Cole says in the Preface, “to embrace various types of 
relationships—not merely relationships of successive quantities or values to 
a selected base, but also percentages of successive quantities or values to a 
known or estimated aggregate . . . and likewise composite values....” A 
more considerable effort was also made to cover governmental series, al-
though the greater emphasis is on private materials. Sources for historical
data have also been included and the special section on regional indexes,
which is approximately one-third of the book, has been added.


On the whole, this book should prove very useful as a starting point in the 
location and use of business indicators. The inclusion of secondary material
among the sources, such as the Supplements to the Survey of Current Business 
and the Economic Almanac, should simplify the finding problem, since 
such sources are often more readily available than the references which 
include the original presentation of the data. The selection of the material 
and the descriptive notes are uncritical, leaving to the user that technical 
study which is necessary for the determination of the appropriateness of the 
indicators. 

As it stands, this book is a good example of the coordination of effort of the 
librarian and the technician, where the librarian furnishes the beginnings of 
research by making known to the technician what materials are available.
To go any further in this direction would have involved the authors in a 
different type of undertaking—one that would have called for the services 
of data specialists. Failure to recognize the practical limits of the type 
imposed would probably have extended the job to the point where this 
volume would not have been finished and published. 

This book is thus a distinct contribution which the statistician should 
find useful either as a device for recalling memory or as a guide to material 
outside the field of one’s immediate specialization. 


Cahiers du Séminaire d’Econométrie, No. 1. Edited by René Roy. Paris: Librairie 
de Medicis, 1951. Pp. 122. Paper. 


ROBERT SOLOW, Massachusetts Institute of Technology


THIS small volume contains five papers which were presented and discussed
at a seminar of the Centre National de la Recherche Scientifique, under
Professor Roy’s direction. It appears to be the first of an annual series that 
will be a welcome addition to the literature, providing samples of the work 
of some leading younger members of the French school of economists and 
econometricians. 

The first paper, by G. Th. Guilbaud, is a discussion from first principles of 
the old problem of the decomposition of time series into additive (or, which 
comes to the same thing, multiplicative) components, to be labelled trend, 
seasonal, cycle, and residual. Most of the discussion is in terms of a periodic 
component of known period, with seasonal variation as the obvious example, 
and this is all to the good. More and more work of high quality is being done 
on methods for the study of movements of business-cycle type. We shall soon 
be reduced to gaps in economic theory, the rapidity of changes in parameters, 
and the poor bearing-power of economic data as our sole excuses for not un- 
derstanding better the structure of economic time series. But the analysis 
and “elimination” of seasonals is still afflicted with a sort of numerology 
which seems to bear no relation to the standard canons of statistical desira- 
bility. I think I remember Harold Hotelling writing somewhere that the only 
trouble with most of the tricks of the seasonal trade is that they should never 
be used. 
To this sort of thing Guilbaud’s paper is a lucid and excellent corrective.
The reader is reminded several times that some model of the structure of the 
series underlies every statistical technique, and that the choice of a model 
should not be made solely on the grounds of statistical convenience. In the 
simplest case a time series is viewed as the sum of a strictly periodic seasonal 
and a random residual. Guilbaud’s main tool is the so-called Buys-Ballot 
table, familiar from periodogram analysis. At first this is treated as simply a 
convenient way to write down the data; the column averages, for fixed 
months, provide a grip on the seasonal; row averages, taken over a single 
year, give indications of a trend. It is then pointed out that the set of 
monthly means is a least-squares estimate of the seasonal, and under suitable 
assumptions about the random residual also a maximum-likelihood estimate. 
The adoption of least squares as a criterion of goodness of fit permits the rep- 
resentation of the procedure as the orthogonal projection of a point in N- 
space onto the subspace of strictly periodic time series. N is here the length 
of the time series. This geometry is very clearly and nicely exploited in the 
course of the exposition. One other useful by-product is a warning that it is 
not proper to eliminate a seasonal and trend in two separate operations. If 
the orthogonality is not to be lost, the elimination must be simultaneous. 
The standard assumptions of additivity, normality, and independence lead 
naturally to the formulation of the whole technique as a simple one-way 
analysis of variance. 
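
The Buys-Ballot arrangement described above can be sketched in a few lines. This is only an illustration with hypothetical data and function names, not Guilbaud's own computation: the series is cut into rows of one period each, the column means estimate the seasonal, and the row means hint at a trend.

```python
from statistics import mean

def buys_ballot(series, period=12):
    """Arrange a series into rows of one full period each (a Buys-Ballot table)."""
    if period <= 0 or len(series) % period != 0:
        raise ValueError("series length must be a whole number of periods")
    return [series[i:i + period] for i in range(0, len(series), period)]

def column_means(table):
    """Mean for each fixed position in the period (e.g. each month): the seasonal."""
    return [mean(col) for col in zip(*table)]

def row_means(table):
    """Mean over each single period (e.g. each year): an indication of trend."""
    return [mean(row) for row in table]

# Hypothetical data: two "years" of four "quarters" each.
series = [1.0, 3.0, 5.0, 3.0, 2.0, 4.0, 6.0, 4.0]
table = buys_ballot(series, period=4)
seasonal = column_means(table)  # [1.5, 3.5, 5.5, 3.5]
yearly = row_means(table)       # [3.0, 4.0]
```

Repeating the column means once per row gives the strictly periodic series closest to the data in the least-squares sense, which is the projection the review refers to.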

Finally this method is looked at as a filtering mechanism. One asks what 
kind of time-series components get caught in such a net, and what kinds pass 
through. This general point of view is extended briefly to the consideration 
of more complicated kinds of stationary time series, including autoregressive 
schemes. It would be interesting to have a more complete description of 
Guilbaud’s approach. 

Another interesting paper, by M. Boiteux, is concerned with the possibility
of pricing at marginal costs in cases where the level of demand fluctuates
at random. At least some of the problems connected with the definition of
marginal costs evaporate if we consider a situation in which demand at the
going price is stable and constant at a level to which equipment is perfectly
adapted. Long-run marginal cost then includes all additional expenses made
necessary by a virtual increase in the level of demand to a new (also stable
and constant) amount, including the equipment costs required to adapt plant
to the new situation. But what is to be done when demand is subject to ran- 
dom ups and downs? What about the one or two empty seats in the train 
just before it is due to depart? Boiteux argues that in this case it is not the 
individual sale, the individual railroad ticket, or kilowatt-hour of electricity, 
that is the proper object of pricing decisions. Instead he would say that what 
is purchased is the right to exercise a frequency distribution of demands, at 
given prices, and it is the characteristics of the distribution that matter. 

In the simplest case, if individual demands are statistically independent 
and satisfy the conditions for the Central Limit Theorem, aggregate demand
at the going price will be approximately Gaussian. If the enterprise fixes a
margin of security, that is, sets the probability that it will be forced to turn 
customers away unsatisfied, a certain plant capacity will be required which 
depends only on the mean and standard deviation of aggregate demand. 
Each customer is to be thought of as purchasing the right to exercise a de- 
mand with a certain mean and standard deviation; given the assumed inde- 
pendence these are the only characteristics which matter. Marginal costs 
then apply not to the purchase of a single item more or less, but to a virtual 
and permanent change in the mean or standard deviation of a customer’s 
distribution. It is simple to calculate the effect this will have on the aggre- 
gate mean or standard deviation and hence on the plant capacity needed to 
maintain the fixed probability of default. These additional plant costs are 
properly included in long run marginal costs. This explains why annual sub- 
scriptions often cost less than twelve monthly issues. A special treatment is 
given to customers who are willing to come to the station, see if an empty
seat develops, and travel only if there should happen to be one. 
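
The capacity calculation in this argument can be illustrated with a short sketch. It assumes, as the review states, independent customer demands and a Gaussian aggregate; the function name, the figures, and the 5 per cent margin of security are hypothetical and are not taken from Boiteux's paper.

```python
from math import sqrt
from statistics import NormalDist

def required_capacity(means, sds, turn_away_prob):
    """Capacity that aggregate demand exceeds with probability turn_away_prob.

    Assumes independent customer demands whose sum is approximately Gaussian.
    """
    if not 0.0 < turn_away_prob < 1.0:
        raise ValueError("turn_away_prob must lie strictly between 0 and 1")
    agg_mean = sum(means)
    agg_sd = sqrt(sum(s * s for s in sds))  # variances add under independence
    z = NormalDist().inv_cdf(1.0 - turn_away_prob)
    return agg_mean + z * agg_sd

# Hypothetical figures: 100 customers, each with mean demand 10 and s.d. 3.
means = [10.0] * 100
sds = [3.0] * 100
base = required_capacity(means, sds, 0.05)

# Extra capacity needed if one more such customer is added permanently.
extra = required_capacity(means + [10.0], sds + [3.0], 0.05) - base
```

Under these assumptions the extra capacity exceeds the new customer's mean demand only slightly, because standard deviations pool as the square root of summed variances. That is the sense in which the marginal cost attaches to a change in the customer's distribution rather than to a single sale.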

The remaining papers can be given only brief mention here. C. Fourgeaud 
applies the weighted regression technique of Tintner, variate difference 
method and all, to obtain a supply and demand curve for textiles in France 
over the period 1920-1938. The statistical assumption underlying this tech- 
nique is that exact linear relations hold among the “systematic parts” of the 
variables involved, and these are obscured by errors of observation and such- 
like. Most contemporary econometricians prefer not to adopt this specifica- 
tion; it is difficult to judge the value of the results. R. Henon contributes a 
discussion of the pure theory of replacement and amortization policy. The 
first part of the paper considers a future known with certainty, making a 
detour over the “internal rate of return.” The second part permits the useful 
life of equipment to be a chance variable with known distribution; there is an 
interesting discussion somewhat along the lines of the actuarial theory of
risk. A short note by A. Nataf discusses some earlier studies of the elasticity 
of the American demand for imports, and concludes the volume. 


Corporate Income Retention, 1915-43. Sergei P. Dobrovolsky. New York: Na- 
tional Bureau of Economic Research, 1951. Pp. xviii, 144. $2.50. 


Albert R. Koch, Board of Governors of the Federal Reserve System


THE recent publication of the National Bureau of Economic Research’s
Financial Research Program entitled Corporate Income Retention, 1915-43,
by Sergei P. Dobrovolsky, is a detailed statistical monograph dealing
with the policies of manufacturing corporations regarding the distribution 
of their profits. It is a short book, closely reasoned, and full of statistics 
and statistical measurements. It contains interesting and thought-provoking 
conclusions for the general economist and statistician, but one wishes that 
the author had broken the almost uninterrupted flow of multiple correlations
and regression coefficients with more theoretical analysis, history of
the period, discussion of the institutional framework, and economic signifi- 
cance of the findings. A more effective method of presentation might have 
been to use a selection of the statistics and statistical techniques involved 
as illustrative of the facts accumulated by the author rather than to conduct 
the reader over the tortuous and at times tedious journey through the com- 
plete array of evidence. 

Among the author’s main conclusions are the following: 

1. Manufacturing corporations in 1915-43 began to save (retain income) 
when profits approached 5 per cent of net worth, and thereafter the propor- 
tion of income saved increased with profits, although at a decreasing rate. 
Although this average propensity to save varied with profits, the
marginal propensity surprisingly remained the same at all levels of income,
namely at 20 to 30 cents for each additional dollar of profits. 

2. Among the most important factors that affected the income retention 
of manufacturing companies during the period covered by the study were 
the level of income, dividend requirements, and physical asset expansion. 
The reserve position of corporations appeared to have little effect on income 
retention. 

3. There has been no evidence of an increasing importance of internal 
relative to external financing of manufacturing corporations in recent 
decades, although external financing has increased in relative importance 
in periods when assets have expanded rapidly. This conclusion substantiates 
a similar one tentatively reached in earlier National Bureau publications of 
its Business Finance Project. 

There is little room in a short review to discuss meaningfully many of the 
specific points on corporate income retention raised by Dr. Dobrovolsky. 
One such specific point concerns the question as to whether a company’s
reserve position affects its income retention policy. The author dismisses 
this possible effect when he finds that the income retention of manufacturing 
corporations in the individual years from 1915 through 1943 was not con- 
sistently and negatively related to surplus at the end of the previous year. 
Perhaps “liquidity” rather than “reserve” position would have been the more
relevant factor to consider as affecting income retention, for surplus
might very well represent funds already invested in bricks and mortar 
rather than liquid resources available for dividend disbursement. There is 
no specific and constant relationship between the surplus of a business 
corporation (as shown on the liability side of its balance sheet) and its 
available liquid funds (as shown on the asset side). 

A caution to the general reader is also perhaps in order. Dr. Dobrovolsky 
bases his work mainly on the financial statistics of manufacturing corporations
collected in the course of the National Bureau’s recent Business Finance
Project. He uses data on all such corporations; samples of 31, 45, and
70 large corporations (the smaller samples presumably being included in
the larger one); and different samples of 73 and 381 small and medium-sized
corporations. In view of the small size of these samples and the special char- 
acter of the samples of small and medium-sized corporations (the 73- 
company sample covers only Wisconsin corporations and the 381-company 
one, only corporations in five quite narrowly-defined industries), Dr. 
Dobrovolsky might well have hedged somewhat more his conclusions on 
the differential behavior of large and small corporations with respect to income
retention.

All this is not to deny the usefulness of monographs on business finance 
such as this to the economist. The effects on economic activity of the financial
status and operations of businesses, to say nothing of those of consumers
and other sectors of the economy, have been underplayed in general economic
analyses far too long. Study of works like this on specific aspects of financial 
policy and behavior will do much to overcome this analytical shortcoming. 


Rural Levels of Living in Lee and Jones Counties, Mississippi, 1945, and a 
Comparison of Two Methods of Data Collection. Barbara B. Reagan and Evelyn 
Grossman. Washington: U.S.D.A. Agricultural Information Bulletin 41, October 
1951. Pp. vii, 164. $.40 from Superintendent of Documents, Washington 25, 
D. C. 


C. Horace Hamilton, North Carolina State College


THIS monograph is of both substantive and methodological interest. It is
a report of a field study designed: (1) to describe the levels of living and
consumption patterns of rural farm and rural nonfarm families (more specifi- 
cally “consumer units”) in southern industrialized areas; (2) to compare the 
levels of living of rural farm and rural nonfarm consumer units; and (3) to 
appraise the use of a “split” schedule in an enumerative survey. 

Lee and Jones Counties, Mississippi, were “purposefully” selected to 
represent rural industrialized areas. Within the counties, areas were selected
at random from those defined by the Master Sample of Agriculture. Altogether
1,191 families and single consumers were interviewed.

Two major schedules were used. The longer schedule (36 bulletin pages)
covers the social and economic characteristics of the consumer units, population
data, and, principally, a detailed set of income and expenditure questions,
including food purchased and home produced. Each interviewee was
asked to report income, expenditures, and related items for an entire year. 
Consequently, the accuracy and completeness of the data were considerably 
affected by bias and error due to memory. Schedules were accepted as 
satisfactory in the case of farm families, if total money receipts and money 
disbursements balanced within 10.5 per cent; and 5.5 per cent in the case 
of nonfarm families. These standards of accuracy have been accepted for 
many years and are useful as a practical device for increasing the efficiency 
of interviewers. Nevertheless, all surveys of this character are suspect. The 
only alternative seems to be the use of household record books kept under 
close supervision and by means of frequent visits of interviewers. This
procedure is especially to be recommended in the cases of food expenditures
and of other items that may be easily forgotten. 

This study was considerably strengthened by the use of a second major 
schedule designed to determine the amount and type of food consumed 
during the week preceding the survey visit. Unfortunately, the week pre- 
ceding the date of each interview varied because the survey required about 
twelve weeks’ time for completion. The authors were careful to show exactly
the dates of collection of food consumption data and to evaluate the error 
involved. 

In method, the most noteworthy contribution of this study is the experi- 
mentation with a “split-schedule” technique designed to shorten each inter- 
view. For control or comparison purposes, the complete long schedule cover- 
ing income and expenditures was used with about one-fourth of the families. 
Then three splits or parts of the long schedule were prepared and each was 
used with a fourth of the families. This technique, as the authors point out: 
(1) requires a much larger sample for equivalent reliability; (2) actually 
increases over-all interview time for the same amount of data, principally 
because of the extra travel and related time wastage; and (3) decreases the 
usefulness of the data collected because of the limitation put on the amount 
of cross tabulation and inter-correlating of items from one part of the 
schedule to the other. 

The authors summarize their experience with the split-schedule technique 
as follows: 


“It was found that the split-schedule technique was open to considerable field 
error. It required a larger sample than did a complete schedule; it increased travel
and supervising costs. All in all, the split schedule was found to be a relatively
[illegible] procedure. The interview time for a particular family, however, was
reduced in comparison with time spent when a complete schedule was used. 
The types of analysis possible when the split schedule is used are somewhat limited.
The relationships between items on the various schedules cannot be studied
in any detail. This was not a serious limitation in the analysis planned for this 
report. 

“The experience gained from this survey would indicate that the split-schedule
technique should not be attempted in a survey of a heterogeneous
population especially if interrelationships of several factors are to be
studied. The experimental use of the method for farm families in Lee and Jones 
Counties, Miss., was undoubtedly made under more favorable conditions than 
those of many surveys in that the population studied was relatively homo- 
geneous.” (Page 3.) 


Factually, this report shows that the average combined time for conduct- 
ing the interviews with the three different split schedules was four hours and 
forty-five minutes, as compared with only three hours for the complete 
schedule. 

The authors also show that their sample was too small to answer some 
questions about the effect of the split-schedule technique on quality of data.
Unfortunately, it is pointed out, the split-schedule technique makes it impossible
to use the balance of expenditures against receipts as a method of
checking accuracy. 

Although the authors are careful not to be dogmatic in their conclusions,
it is obvious that after making this interesting experiment they are perhaps
somewhat “sadder and wiser” about the usefulness of the split-schedule 
technique in general. Certainly, after carefully studying their results and 
comments, this reviewer feels that we should look elsewhere for methods of 
shortening interviews, cutting costs, and improving the quality of field 
enumeration data. (A national or state census is another matter.) It might be 
more profitable to interview the same families three times during a year 
than to visit three (or more) times as many with split schedules. 

A minor methodological experiment in this study involved the use of 
“ownership of a cow” as an indicator that a consumer unit, cultivating less 
than three acres, produced enough food ($250 worth) to be classed as a 
farm. If a family cultivating less than three acres did have a cow, it was 
arbitrarily classed as farm; but, if it did not have a cow, other questions 
were asked regarding value of farm products raised. The use of “cow owner- 
ship” as an indicator increased substantially the number of units classed 
as farm; but it rarely omitted units which would be classed as farm by the
1945 Census definition. Factually, it was found that only 60 per cent of the 
units classified as farm by the “cow” criterion had enough home produced 
food and cash sale of farm products to total $250 or more. Although a 
number of units classified as farm by the “cow” criterion were not farms in 
the sense of the 1945 Census definition, the general results of the study, the 
authors show, were not markedly different from what would have been 
obtained had the Census definition been strictly adhered to. 

An important discovery of this study is that highly significant and inter- 
esting differences in levels of living were found in comparing (1) farm con- 
sumer units selling at least $200 worth of farm products with (2) farm con- 
sumer units with little or no sales of farm products. The latter group had 
better nutritional standards, higher incomes, and higher levels of living than 
the first group. And, surprisingly, the farm consumer units with little or no 
sales fed themselves better than even the rural-nonfarm consumer units 
who, otherwise, had higher levels of living. The implied conclusion here is 
that working in town (rural industry) and living in the country is a good way 
of life. Industrialization of rural areas, the authors state, is increasing the 
number of farm families who raise food for home use, but who sell little in 
the market. If this trend keeps up, there will be even more pressure for 
revising the very much battered Census definition of a farm. 

There is no space here for appraising the many interesting substantive 
findings of this study. A more popularly written report of this highly statisti- 
cal and encyclopedic study is needed for rural educators and others who 
otherwise will be befogged, frustrated, if not disgusted with having to wade 
through so much tabular matter, methodological description, and statistical 
analysis. Of the 164 pages in the report, 44 are used to reproduce the sched- 
ules, and 62 are used for the tabular summary. As it is, the report will be 
primarily useful to other statisticians and research workers in home economics.


The “Prestige Papers,” A Survey of Their Editorials. Ithiel de Sola Pool, with the
collaboration of Harold D. Lasswell, Daniel Lerner, et al. Stanford, California:
Hoover Institute Studies, Stanford University Press, 1952. Pp. vii, 146. $1.75.


Nathan Maccoby, Boston University


THIS book is a report of one study of a large series being carried out by the
Hoover Institute at Stanford University. Five series of studies are in
progress, and the present report appears as the second of seven in the series 
on symbols. 

The “Prestige Papers” concern a quantitative analysis of a sample of edi- 
torials of important newspapers of five major powers: the United States, 
England, France, Germany, and Russia. The papers were selected primarily 
on the basis of prestige rather than circulation size. In Germany and Russia 
they are among the so-called official government organs, but in England, 
France, and the United States, they are the papers having the most pres- 
tigeful reputations. In the United States, the New York Times was selected, 
in England, The Times, and in France Le Temps before World War II, and Le 
Monde since the War. In Germany, the papers chosen for the periods before 
the Nazi regime, the Norddeutsche allgemeine Zeitung, 1910-1917, 1918-1920, 
and the Frankfurter Zeitung, 1920-1932, are clearly in the same category, but 
the Völkischer Beobachter from 1933-1945 and Izvestia from 1918-1937 are
better described as official government organs. Novoe Vremia, 1876-1912, was
the pre-totalitarian newspaper selected for Russia. Each paper chosen for
study is, with one possible exception (Izvestia vs. Pravda), the most important
single newspaper for a given time period in its country, and analysis
of their editorials provides some extremely interesting information. 

The sampling methods employed are simple but generally sound. For each 
paper yearly samples consisting of the editorials in the twenty-four issues 
published on the first and fifteenth of each month were selected. It perhaps 
would have been preferable to work from a series of random numbers for the 
particular issue within each month, but this reviewer cannot see any real 
likelihood of bias in comparing countries arising from the arbitrary selection
of two dates employed consistently over approximately a sixty-year period. 

A list of 416 symbols was arbitrarily selected “to reflect trends in world 
politics with particular reference to changing attitudes toward the values of 
democracy, fraternity, security and well-being.” Names of national units 
make up one group of 206 symbols; the other 210 consist of key symbols of 
contending major political ideologies. The selection of symbols for this study 
is based on a previous Hoover Institute study, Lasswell’s The World Revolu- 
tion of Our Times: A Framework for Basic Policy Research, and represents an 
ingenious attempt to get at some basic political concepts through symbol 
analysis. The content analysis involved listing the presence or absence of the 
symbols in each editorial as well as the attitude expressed toward each sym- 
bol. Hypotheses were tested by comparing the incidence of certain classes 
of symbols in the editorials for different periods in the relevant country’s 
political history. Two methods of comparison were used. One had to do with
the degree of concentration on a relatively small number of symbols. These
comparisons were made using Yule’s K, which measures the frequency of 
occurrence of a symbol in such a way that the average value of K is inde- 
pendent of sample size. As the authors note, no sampling distribution of 
Yule’s K has been developed; as a result, no significance tests are available.
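
The review does not state the formula used. As a sketch only, assuming the usual definition of Yule's characteristic, K = 10^4 (sum of m^2 V_m - N) / N^2, where N is the total number of symbol occurrences and V_m is the number of distinct symbols occurring exactly m times, the measure might be computed as follows. The function name and the toy inputs are hypothetical.

```python
from collections import Counter

def yules_k(tokens):
    """Yule's K for a sequence of symbol occurrences (usual textbook definition)."""
    counts = Counter(tokens)             # symbol -> number of occurrences
    n = sum(counts.values())             # N: total occurrences
    if n == 0:
        raise ValueError("at least one symbol occurrence is required")
    spectrum = Counter(counts.values())  # m -> V_m, symbols occurring exactly m times
    s2 = sum(m * m * v for m, v in spectrum.items())
    return 1e4 * (s2 - n) / (n * n)

# Toy inputs: no repetition at all versus complete concentration on one symbol.
spread = yules_k(["peace", "war", "trade", "labor"])  # 0.0
concentrated = yules_k(["war"] * 4)                   # 7500.0
```

Larger values indicate heavier concentration on a few symbols, which is the direction of the comparison between "totalitarian" and "democratic" papers reported in the review.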

Some very interesting findings result from this type of comparison. In 
general, “totalitarian” papers show a much greater degree of concentration 
on a few key symbols than do “democratic” papers. During war time, 
“democratic” papers show a similar pattern of concentration, but this concentration
is confined to a smaller proportion of all the symbols counted.
A number of other comparisons were made using this method. 

The second method measured symbolic change in editorial content over
time. This method was designed to provide an index of changing political 
values upheld by editorials associated with distinct historical periods in a 
country. For these comparisons, chi-square tests were employed to test the 
null hypothesis that there was no change in symbolic patterns in two or more 
different periods. The least change over time occurred in the British press; 
the most in the Russian. 
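
A test of this kind can be sketched as a chi-square test of homogeneity on a table of symbol counts, with one row per historical period and one column per class of symbol. The layout of the table and the counts below are assumptions made for illustration; they are not the study's data, and the authors' exact procedure is not described in the review.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if total == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain some counts")
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical counts: rows are two periods, columns are two classes of symbol.
unchanged = chi_square_statistic([[10, 20], [20, 40]])  # same proportions in both periods
shifted = chi_square_statistic([[10, 0], [0, 10]])      # complete change of pattern
```

The statistic is referred to the chi-square distribution with the returned degrees of freedom; a large value leads to rejection of the hypothesis of no change in symbolic pattern between periods.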

The author concludes that in general over the last sixty years there are 
two main trends reflected in this analysis: “(1) a shift in the center of attention,
in which traditional liberalism is being replaced by proletarian doctrines
and (2) a growing threat of war and a corresponding increase of
nationalism and militarism.” 

It seems to this reviewer that this study represents an ingenious and im- 
portant application of content analysis to problems of political values and 
ideational trends, and represents a forward step in the application of quan- 
titative methods to complex social science problems. 


Heredity in Uterine Cancer. Douglas P. Murphy (Assistant Professor of Obstetrics
and Gynecology and Research Associate, Gynecean Hospital Institute
of Gynecologic Research, University of Pennsylvania). Cambridge: Harvard University
Press, published for The Commonwealth Fund, 1952. Pp. xi, 128. $2.50.


B. G. GREENBERG, University of North Carolina 


THIS monograph reports the results of an investigation from 1945 to 1947
whose primary objective was to measure the role of heredity in the
occurrence of cancer of the uterine cervix. During the study, it became 
apparent that cancer of the uterine cervix was not always pinpointed in 
diagnosis. Therefore the role of heredity in the less specific diagnosis of can- 
cer of the uterus was substituted. 

The study purports to show that mothers and aunts of persons with cancer 
of the cervix have a slightly greater risk of uterine cancer than otherwise. 
Other female relatives were not similarly affected, nor was cancer of other 
sites higher in the cancer families. 

The preface states that “The chief weakness of the statistical studies
(in this field) has been the inadequacy of the control observations. This alone
was a sufficient incentive for the undertaking of the present investigation.”
After reviewing the resulting effort, the inadequacy of the control observa- 
tions still remains the major weakness. 

The method developed was as follows: (a) A group of known cases of
cancer of cervix was selected from hospitals in and around Philadelphia (201
cases were usable). (b) All female relatives, living or dead, of each index
case, or proband, were investigated by field workers and other follow-up
measures to determine the occurrence of cancer among them. There were
2809 relatives found and cancer information available for 2424 of them.
(c) A comparison group, referred to as “control probands,” was also investigated.
They were selected by the combination of a judgment sample and a
chunk. There were 215 index women in the control. Their located relatives
numbered 2420, for whom 2796 had cancer information available.

The control probands originated from three sources: (1) The files of a dental
clinic furnished names which resulted in 113 probands. (2) Volunteers
from nearby women’s clubs were obtained to raise the socio-economic level
of the controls. “The purpose of the study was explained at a meeting of 
each club. ... The point was emphasized, when addressing the club mem- 
bers, that the volunteer should not offer herself for interview because cancer 
had occurred frequently among her relatives.” This group furnished 82
probands. (3) Casual volunteers provided 20 additional names. Again, this 
group was screened to eliminate those who might have volunteered because 
of unusual familial frequency of cancer. 

It would seem that the selection of controls by the latter two procedures 
was equivalent to stacking the cards. If the women had followed the pre- 
cautions conscientiously, the controls would automatically have too few 
cancerous relatives. If women with high familial cancer rates volunteered 
because of that fact and in spite of the advice, the reverse would hold true. 
It appears from the results that this latter actually occurred. 

Although the investigator recognized this defect himself and pointed it 
out in the text, the admission does not overcome the limitations it imposes. 
To meet this problem, when comparisons of familial cancer between the 
cancer and control groups were made, the latter was divided according to the 
source of origin. But the major comparison involving cancer of the uterus 
provided no such separation of the controls. Also, the method for selecting 
controls resulted in 6 cancer cases among the 215 probands, one of them 
being cancer of the cervix. This tended to dilute the differences, if any, 
between the two groups. 

To obtain valid controls, it might be preferable to disguise the objective 
and call it a family health study. Such concealment can be maintained 
throughout the investigation by buffering the questionnaire with other 
health queries since it is important in many cancer studies not to reveal 
diagnosis to the patient. A probability-sample, using stratification and
double-sampling procedures, could be employed in selecting controls.

It is a simple matter of hindsight to criticize an investigation as extensive
and intensive as this one. The author and his co-workers are to be praised for 
the degree to which they carried the analysis and for many checks on the 
data. Two minor criticisms are offered. 

The first is that it would be desirable when contacting relatives to keep 
the field investigator unaware of the nature of the proband, whether cancer 
or control. Checks were made on the possible effects of this shortcoming. 
It was concluded that the investigators were unbiased in their techniques.

The second point concerns the advisability of including the proband’s 
aunt by marriage. In general, there is no hereditary implication in a pro- 
band’s mother’s (father’s) brother’s wife. Aunts by marriage represented 
about 10% of the relatives. 

Several typographical errors and inconsistencies appear in the tables and 
text. One difficulty stemmed apparently from tabulating the results at a 
date when information was unavailable, and retabulating at a later date 
after unknown items had been changed. 




PUBLICATIONS RECEIVED 


Adler, John H., Schlesinger, Eugene R., 
and Westerborg, Evelyn. The Pattern of 
United States Import Trade Since 1923. 
Some New Index Series and Their Application.
Federal Reserve Bank of New York,
1952. Paper. 

Bachman, George, and Associates. 
Health Resources in the United States; Personnel,
Facilities, and Services. Washington,
D. C.: The Brookings Institution, 1942.
$5.00. 

Backman, Jules. The Economics of Annu- 
al Improvement Factors. New York Uni- 
versity Schools of Business, 1952. Paper. 

Davidson, Sidney. The Plant Accounting 
Regulations of the Federal Power Commis- 
sion: Michigan Business Studies, Vol. XI, 
No. 1. Ann Arbor: University of Michigan 
Press, 1952. Paper. $2.00.

Director, Aaron, ed. Defense, Controls, and 
Inflation. Chicago: University of Chicago 
Press, 1952. $3.50. 

England and Wales, The Registrar Gen- 
eral’s Statistical Review for the Year 1950. 
Tables, Part II, Civil. London: H.M. Sta- 
tionery Office, 1952. Paper. 5 shillings. 

Grant, Eugene L. Statistical Quality Control,
New Second Edition. New York: McGraw-Hill
Book Company, 1952. $6.50.

Hagood, Margaret Jarman, and Price,
Daniel O. Statistics for Sociologists, Revised
Edition. New York: Henry Holt and
Company, 1952. $5.75.

Hald, A. Statistical Tables and Formulas. 
New York: John Wiley and Sons, 1952. 
Paper. $2.50. 

Hald, A. Statistical Theory with Engineering
Applications. New York: John Wiley
and Sons, 1952. $9.00.

Hatt, Paul K. Backgrounds of Human 
Fertility in Puerto Rico: A Sociological 
Survey. Princeton, N. J.: Princeton Uni- 
versity Press, 1952. Paper. $5.00. 

Hickman, W. Braddock. Trends and 
Cycles in Corporate Bond Financing. Occa- 
sional Paper 37. New York: National 
Bureau of Economic Research, 1952. Paper. 

India, Census of. Paper No. 6, Age
Tables: Uttar Pradesh 1941 on Y-sample.
Paper. 10s. 6d.

India, Census of. Paper No. 9, Age Tables:
Madhya Pradesh 1941 on Y-sample. Paper.
4s.

India, Census of. Paper No. 10, Age Tables:
Bombay 1941 on Y-sample. Paper. 3s.

Kimmel, Lewis H. Share Ownership in 
the United States. Washington, D. C.: The 
Brookings Institution, 1952. Paper. 

Modley, Rudolf, and Lowenstein, Dyno. 
Pictographs and Graphs: How to Make and 
Use Them. New York: Harper and 
Brothers, 1952. $4.00. 

Petersen, William. Some Factors Influ- 
encing Postwar Emigration from the Nether- 
lands. Publication of the Research Group 
for European Migration Problems VI. The 
Hague: Martinus Nijhoff, 1952. Paper. 

Rao, C. Radhakrishna. Advanced Statis- 
tical Methods in Biometric Research. New 
York: John Wiley and Sons, 1952. $7.50. 

Robonsin, Marilyn Druck. Washington 
State Statistical Abstract. Seattle: Univer- 
sity of Washington Press, 1952. Paper. 
$4.50. 


Savage, I. Richard. Notes on Sedimenta- 
tion Models. National Bureau of Standards 
Report 1704. June 1952. Paper. 

Spear, Mary Eleanor. Charting Statistics. 
New York: McGraw-Hill Book Company, 
1952. $4.50. 

Thomsen, Frederick Lundy, and Foote, 
Richard Jay. Agricultural Prices. New 
York: McGraw-Hill Book Company, 1952. 
$6.50. 

United Nations. Sample Surveys of Cur- 
rent Interest, Fourth Report. Statistical 
Papers Series C, No. 5. New York, March 
1952. Paper. 50 cents. 

United Nations. Statistics of National Income and 
Expenditure. Statistical Papers Series H, 
No. 1. New York, 1952. Paper. 50 cents. 

World Health Organization. Expert Com- 
mittee on Health Statistics, Third Report. 
WHO Technical Report Series No. 53. 
Geneva, July 1952. Paper. 35 cents. 




RANDOM DIGITS (1-6000) 


Below are the first 6000 of a series of random digits which will be 
published as space permits, thus providing frequent fresh supplies. 
The digits are from A Million Random Digits to be published by The 
Rand Corporation, Santa Monica, California, through whose courtesy 
they are published here. 


10097 32533 76520 13586 34673 
37542 04805 64894 74296 24805 
08422 68953 19645 09303 23209 
99019 02529 69376 70715 28311 
12807 99970 80157 36147 64032 


66065 74717 34072 76850 36697 
31060 10805 45571 82406 35303 
85269 77602 02051 65692 68665 
63573 32135 05325 47048 90553 
73796 45753 03529 64778 35808 


19612 98520 17767 14905 68607 
39141 11805 05431 39808 27732 
64756 83452 99634 06288 98083 
92901 88685 40200 86507 58401 
03551 99594 67348 87517 64969 


98884 65841 17674 17468 50950 
27369 80124 35635 17727 08015 
59066 74350 99817 77402 77214 
91647 69916 26803 66252 29148 
83605 09893 20505 14225 68514 


22109 24895 91499 14523 68479 
50725 35720 80336 94598 26940 
13746 14141 44104 81949 85157 
36766 27416 12550 73742 11100 
91826 82071 63606 49329 16505 


58047 21445 61196 90446 26457 
45318 72513 15474 45266 95270 
43236 71479 94557 28573 67897 
36936 83210 42481 16213 97344 
46427 63749 23523 78317 73208 


27686 48162 05184 04493 52494 
36858 70297 LOR] 00549 97654 
47954 32979 086 35963 15307 
02040 12860 5U958 59808 08391 
34484 40219 57621 46058 85236 


47774 51924 09282 32179 00597 
79953 59367 23394 69234 61406 
54387 54622 05280 19565 41430 
08721 16868 95491 45155 14938 
89837 68935 78521 94864 31994 




75246 
64051 
26898 
45427 
01390 


87379 
20117 
01758 
19476 
36168 


24826 
16232 
00406 
49140 
32537 


43902 
45611 
49883 
77928 
94138 


09188 
90045 
73189 
75768 
54016 


08358 
28306 
53840 
91757 
89415 


78430 
77400 
80457 
51878 
90070 


66209 
86882 
75974 
93783 
92419 


68414 
55210 
05636 
94049 
69777 


51926 
28834 
73852 
10123 
34313 


33824 45862 96345 98086 
88159 96119 77963 33185 
09354 33351 07520 80951 
26842 83609 38423 79752 
92286 77281 02463 18633 
25241 05567 15880 74029 
45204 15956 71926 54178 
75379 40419 64425 11664 
07246 43667 79782 48324 
10851 34888 35337 69074 
45240 28404 44999 05249 
41941 50949 89435 56463 
96382 70774 20151 96296 
71961 28296 69861 98380 
98145 06571 31010 52567 
77557 32270 97790 78498 
80993 37143 05335 49553 
52079 84827 59381 32151 
31249 64710 02295 11314 
87637 91976 35584 12364 
20097 32825 39527 04220 
85497 51981 50654 94938 
50207 47677 26269 62290 
76490 20971 87749 90429 
44056 66281 31003 00682 
69910 78542 42785 13661 
03264 81333 10591 40510 
86233 81594 13628 51215 
53741 61613 62269 50263 
92694 00397 58391 12607 
77513 03820 86864 29901 
19502 37174 69979 20288 
21818 59313 93278 81757 
51474 66499 68107 23621 
99559 68331 62535 24170 
33713 48,07 93584 72869 
85274 86893 11303 22970 
84133 89640 44035 52166 
56732 16234 17395 96131 
65138 56806 87648 85261 
88530 38001 02176 81719 
26556 37402 96397 01304 
53410 97125 40348 87083 
75870 21826 41134 47143 
07429 73135 42742 95719 
82793 07638 77929 03061 
76400 60528 83441 07954 
45027 83596 35655 06958 
51466 10850 62746 99599 
95148 39820 98952 43622 


AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1952 
11711 71602 75763 59580 06478 


77586 56271 62546 38508 07341 
31417 21815 21220 30692 70668 
34072 64638 17695 65443 95659 


09035 85794 64547 27267 50264 


18072 96207 25844 91307 06991 
19814 59175 94206 68434 94688 
92983 05128 37470 48908 15877 
10507 13499 97976 06913 45197 
63147 64421 00104 10455 16019 


75569 78800 88835 44579 12883 
23793 48793 90822 31151 21778 
94688 16127 56196 11294 19523 
18288 27437 49632 02309 67245 
13192 72294 07477 65533 60584 


19072 24210 36699 92261 53853 
84473 13622 62126 00819 24637 


54745 24591 35700 28108 83080 
42672 78601 11883 23924 16444 
14210 33712 91342 74538 60790 


97343 65027 61184 04285 29329 
30976 38807 36961 31649 99380 


59515 65122 59659 86283 33121 
52670 35583 16563 79246 36269 
47377 07500 37992 45134 64350 


41377 36066 94850 58838 90830 
38736 74384 89342 52623 24241 
12451 38992 22815 07759 07075 
24334 36151 99073 27493 50363 
18157 56178 65762 11161 71210 


03991 10461 93716 16894 66083 
38555 95554 32886 59780 08355 
17546 73704 92052 46215 55121 
32643 52861 95819 06831 00911 
69572 68777 39510 35905 14060 


24122 66591 27699 06494 14845 
61196 30231 92962 61773 41839 


30532 21704 10274 12202 39685 
03788 97599 75867 20717 74416 
48228 63379 85783 47619 53152 


11661 60365 94653 35075 33949 
28000 83799 42402 56623 34442 
08747 32960 07405 36409 83232 
56441 19322 53845 57620 52606 


09483 11220 94747 07399 37408 


06830 31751 57260 68980 
53473 88492 99382 14454 04504 
63335 30934 47744 07481 83828 
64169 22888 48893 27499 98748 
39542 78212 16993 35902 




42614 
34994 
99385 
66497 
48509 


15470 
20094 
73788 
60530 
44372 


52326 
93460 
31792 
87315 
48585 


24326 
93614 
02115 
00080 
63545 


70014 
37239 
18637 
05327 
95096 


43253 
80854 
80088 
80890 
93128 


72484 
18711 
16120 
4235 
28193 


82157 
75363 
09070 
04146 
30552 


94015 
74108 
62880 
11748 
17944 


66067 
54244 
30945 
69170 
08345 


70774 41849 84547 46850 
95596 46352 33049 69248 
38649 11087 96294 14013 
92176 52701 08337 56303 
81007 57275 36898 81304 

24831 20857 73156 70284 
52225 15633 84924 90415 
76160 92694 48297 39904 
09088 77613 19019 $8152 
94897 38688 32486 45134 
34677 47075 25163 01889 
45305 96892 65251 07629 
59747 00292 36815 43625 
16520 58072 64397 11692 
68652 46850 04515 25624 
79375 79139 83761 60873 

33521 93432 14387 06345 

59589 93622 51321 92246 
20554 38306 72472 00008 
59404 18248 05466 55306 
15021 41290 85932 39528 
33295 05870 32364 81616 
37509 82444 23238 07586 
82162 20247 70703 90767 
67946 48460 21199 40188 
84145 60833 17292 34414 
09279 43529 59144 63439 
77074 88722 16554 67049 
18002 94813 49440 79495 
18464 74457 44553 91704 
82474 25593 48545 19715 
53342 44276 75122 38793 
82641 22820 92904 54196 
13574 17200 69902 60014 
29593 88627 94972 16315 
86887 55087 19152 25955 
44989 16822 36024 08150 
93399 45547 94458 83155 
52162 90286 54158 26860 
04737 21031 75051 87052 
46874 32444 48277 59820 
88222 88570 74015 25704 
87873 95160 59221 22304 
12102 80580 41867 17710 
05600 60478 03343 25852 
42792 95043 52680 46780 
91030 45547 70818 59849 
57589 31732 57260 47670 
37403 86995 90307 94304 
88975 35841 85771 08105 




94770 27767 43584 85301 88977 
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Emphasizes the actual application of statistics to business problems. It is the aim 
throughout the book to lead the future businessman to appreciate the usefulness of 
statistical methods and to employ them in the practical problems of business. The 
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A major supplement in the form of a combination problem-book and partial set of 
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By Avsert E. WaucH, University of Connecticut. 531 pages, $5.50 


This text is designed to introduce the student to statistical concepts and nomenclature 
and to encourage him to think in statistical terms. Every effort is made to keep the 
discussion at the beginner’s level and to present basic ideas in such a way that the 
student will find it easy to continue under his own power. 


STATISTICAL TABLES AND PROBLEMS. New 3rd Edition 


By ALBERT E. WAUGH. 248 pages, $3.00 


Designed for use with any standard text, this manual was prepared to develop in 
the student a real feeling and appreciation of statistical data, in order that he may 
be able to visualize the relationships between the characteristics of the raw data and 
the nature of the final results. The manual contains all the more common tables, 
most frequently used by statisticians. Actual data are given for hundreds of exercises 
and problems in elementary statistics, drawn from a wide variety of fields. 
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UNIVERSITY PRESS 


Cost and Production 
Functions 


By RONALD W. SHEPARD. Here an integrated mathematical 
theory is developed to show the precise relationship between cost 
and production functions of a rationally organized process. In this 
original contribution to mathematical economics, Dr. Shepard, of 
the Rand Corporation, has disclosed new duality relationships with 
various side restrictions and explored the index number aspects of 
the Cobb-Douglas production functions. 

100 pages, planographed, paper, $2.00 


Backgrounds of Human 
Fertility in Puerto Rico 


A SOCIOLOGICAL SURVEY 


By PAUL K. HATT. This study, sponsored by the Social Science 
Research Center of Puerto Rico and the Office of Population Re- 
search of Princeton, examines the relationships between cultural 
factors and fertility in a still underdeveloped but industrializing 
area. 538 pages, $5.00 
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of India and Pakistan 


By KINGSLEY DAVIS. This book draws new and striking conclu- 
sions about the whole economic and social structure of the Indian 
region. “The best technical analysis of the India and Pakistan 
population problem yet published.”—Middle Eastern Affairs. “In- 
dispensable for those concerned with the past and future of this 
densely populated subcontinent.”—Journal of Political Economy. 
Editorially sponsored by the Princeton Office of Population Re- 
search. Maps and charts. $7.50 
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A NEW BOOK ON STATISTICAL ANALYSIS . . . 


“Statistical Design and Analysis of Experiments for Development Research” by Donald 
Sue Villars, Research Scientist, U.S. Naval Ordnance Test Station, Inyokern, China Lake, 
California. 

Based on courses and lectures given at various universities, this book summarizes the funda- 
mental principles involved in the design of efficient experiments. A journalistic style is em- 
ployed so that the reader may almost immediately begin applying various analyses to his own 
data without having to go any deeper into the subject than he desires. Major emphasis is 
placed on the techniques of small sample statistical analysis for use in laboratory and 
development work, rather than on large sample analysis applicable to plant or mass-production 
operation. 


Of particular value is a detailed study of replication degeneracy, a design where several 
different types of treatments are repeated to differing extents. An understanding of the simple 
principle involved in this design will prevent the beginner from making erroneous “statistical” 
conclusions. In addition, the author formulates a rational system for determining the correct 
method of analyzing variance. New and extended tables of variance components are worked out, 
and a detailed description is given of the technique for applying these components to 
the appropriate error estimate for a legitimate significance test. 

Many tables of actual test data and typical calculations are used to illustrate the text. For 
the convenience of the beginner, a comprehensive glossary of terms is included, and the book 
also contains lists of typical problems with their solutions, and both author and subject indices. 
This 5½ x 8½ inch book contains 472 pages, is cloth bound and sells for only $6.50. Order 
your book on money back guarantee today. 


Order your copy today 
WM. C. BROWN COMPANY 
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Editor: Lawrence W. Witt 
Michigan State College, East Lansing, Michigan 
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Tractors in the Village—A Study in Turkey 
The New Agricultural Economics 
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DEMAND ANALYSIS* 
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re 4 covers the general field of applied statistics with pronounced emphasis on 
experimental design. Detailed methods for the statistical analysis of experiments 
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(Formerly titied: Elementary Applied Statistics) 


the acquisition of mathematical skills, it: 
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® Readable and informal style throughout 
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